General Disclaimer 


One or more of the Following Statements may affect this Document 


• This document has been reproduced from the best copy furnished by the 
organizational source. It is being released in the interest of making available as 
much information as possible. 


• This document may contain data, which exceeds the sheet parameters. It was 
furnished in this condition by the organizational source and is the best copy 
available. 


• This document may contain tone-on-tone or color graphs, charts and/or pictures, 
which have been reproduced in black and white. 


• This document is paginated as submitted by the original source. 


• Portions of this document are not fully legible due to the historical nature of some 
of the material. However, it is the best reproduction available from the original 
submission. 


Produced by the NASA Center for Aerospace Information (CASI) 




10453 ROSELLE STREET
UNIVERSITY INDUSTRIAL PARK
SAN DIEGO, CALIFORNIA 92121

(NASA-CR-147393) DIGITAL TV PROCESSING SYSTEM (Linkabit Corp.) 123 p HC $5.50 CSCL 17E G3/17 N76-15236 Unclas 08532




LINKABIT CORPORATION 
10453 Roselle Street 
San Diego, CA 92121 


FINAL REPORT 
for the 

DIGITAL TV PROCESSING SYSTEM 


26 November 1975 


Contract NAS9-14561 
DRL Item No. 2 


Submitted to: 

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION 
Lyndon B. Johnson Space Center 
Houston, Texas 77058 



TABLE OF CONTENTS 

Section                                                        Page 

Abstract                                                         vi 
1.0  Introduction                                                 1 
     1.1  Background                                              1 
     1.2  Basic System Constraints and Hardware Considerations    2 
     1.3  Scope of this Report                                    3 
2.0  Monochrome Video Data Compression                            5 
     2.1  Two-Dimensional Monochrome Video Data Compression       8 
     2.2  Variable Rate, Motion Detecting Video Data Compression 22 
     2.3  Transforming the Variable Rate Algorithm into a 
          Fixed Rate Algorithm                                   35 
3.0  Color Video Source Coding                                   46 
     3.1  Digital Color Video Data Compression                   50 
4.0  Data Error Rate Analysis                                    68 
     4.1  General Data Error Expression for the 
          Two-Dimensionally Compressed Video Algorithm           72 
     4.2  General Data Error Expression for the Motion 
          Detection Algorithm                                    75 
     4.3  Uplink Baseline System Performance                     80 
     4.4  Downlink Baseline System Performance                   88 
     4.5  Concatenated Channel Coding Performance Improvements   91 
          4.5.1  Concatenated Channel Coding for All the Data    96 
          4.5.2  Concatenated Coding for Only the Reference 
                 Symbols                                         99 
5.0  Computer Simulation                                        101 
     5.1  Video Tape Presentation                               114 
References                                                      116 


TABLE OF FIGURES 

Figure                                                         Page 

2.1   Typical Response of Logarithmic DPCM to Step Error         16 
2.2   Hadamard Component Designation                             21 
2.3   Compressed Bit Rate vs. Motion Detected                    33 
2.4   Uplink Video Source Encoder                                41 
2.5   Uplink TV Source Decoder                                   42 
2.6   Typical Hardware Organization of a Field Rate Buffer       44 
3.1   Transformation of Color Sample into Y, I and Q 
      Coordinates                                                48 
3.2   Modulated Composite Color Signal                           51 
3.3   Orientations of Subpicture Sample Points                   56 
3.4   Designations of the Chrominance Hadamard Components        61 
3.5   Color Video Source Encoder                                 64 
3.6   Color Video Source Decoder                                 65 
4.1   Error Propagation Effect of the D.C. Components of a 
      Typical Line Group                                         70 
4.2   Bit Error Probability Performance of a K=7, R=1/3 
      Convolutional Coding System Obtained by Simulation         85 
4.3   Small Bit Error Probability Performance of a K=7, R=1/3 
      Convolutional Coding System Obtained from Bounds           86 
4.4   Noisy Channel Performance of the Uplink Baseline Source 
      and Channel Coding System                                  87 
4.5   Noisy Channel Performance of an Uncoded Uplink Digital 
      Video Communications System with 7 Bits/Picture Element    89 
4.6   Bit Error Probability Performance of a K=7, R=1/2 
      Convolutional Coding System Obtained by Simulation         92 
4.7   Small Bit Error Probability Performance of a K=7, R=1/2 
      Convolutional Coding System Obtained from Bounds           93 
4.8   Noisy Channel Performance of the Downlink Baseline 
      Source and Channel Coding System                           94 
4.9   Noisy Channel Performance of an Uncoded Downlink Digital 
      Video Communications System with 13 Bits/Color Picture 
      Element                                                    95 
4.10  Uplink Concatenated Coding Bit Error Rate Performance      97 
5.1   LIM Video Controller Functional Block Diagram             102 
5.2   Typical Geometrical Orientation of Reference and 
      Processed Frames                                          104 



TABLE OF TABLES 

Table                                                          Page 

2.1   Quantization Table                                         18 
2.1   Quantization Table (Continued)                             19 
2.1   Quantization Table (Continued)                             20 
3.1   Quantization Table                                         59 
3.1   Quantization Table (Continued)                             60 
4.1   Error Probability Bounds                                   82 
5.1   Percentage of Motion Regions Detected                     115 


ABSTRACT 


This report presents two digital video data compression 
systems that are directly applicable to the Space Shuttle TV 
Communication System. 

1) For the uplink, a low rate (1 Megabit/sec) monochrome 
video data compressor is used. The data compression is 
achieved by a motion detection technique in the Hadamard 
domain. The resultant source rate is directly proportional 
to the motion content of the reproduced scene. To transform 
the variable source rate into a fixed rate, an adaptive rate 
buffer is provided. The rate buffer size is approximately 
600K bits, and the buffer is required only at the encoder. 

2) For the downlink, a color video data compressor is 
considered. The data compression is achieved first by an 
intra-color transformation of the original signal vector, 
(R(t), G(t), B(t)), into a vector, (Y(t), I(t), Q(t)), which 
has lower information entropy. Two-dimensional data 
compression techniques are then applied to the Hadamard 
transformed components of Y(t), I(t) and Q(t). The resultant 
source rate is fixed at approximately 24 Megabits/sec. 

Mathematical models and data reliability analyses are 
also provided for the above video data compression techniques 
when the compressed data are transmitted over a channel 
encoded Gaussian channel. It is shown that substantial gains 
can be achieved by the combination of video source and 
channel coding. 
1.0 Introduction 

This report presents the results of a study on digital 
TV processing suitable for Space Shuttle communication 
purposes. The TV communication system is divided into two 
parts: uplink and downlink transmissions. A monochrome video 
source coding technique utilizing a two-dimensional Hadamard 
transformation and motion detection is used for the uplink 
TV transmission, where the data rate is very low. A color 
video source coding technique utilizing a two-dimensional 
Hadamard transform is used for the downlink TV transmission. 
The source data rates are 1 megabit per second for the uplink 
and approximately 24 megabits per second for the downlink. 

1.1 Background 

In the past, due to hardware considerations, television 
communication for space flight systems has relied 
exclusively on analog processing and transmission techniques. 
With the advances in digital communication techniques, in 
particular the recent low dissipation mass memory 
technologies, digital television transmission offers several 
potential advantages over analog transmission, such as an 
increase in data link efficiency, higher picture reliability, 
less hardware complexity, elimination of bulky transmitting 
antennas, and ease of multiplexing with other digital data. 
An efficient video source coding technique used in 
conjunction with an efficient channel encoding technique can 
provide significant performance advantages over the analog 
transmission system. It should be emphasized that no 
universally optimum video data compression scheme exists for 
arbitrary systems. Generally speaking, a higher data 
compression ratio requires more complex hardware. Thus, an 
optimal choice of video source encoder-decoder often depends 
on the environment of the system under consideration. Some 
of the constraints and obvious hardware considerations for 
Space Shuttle TV communications are given in the following 
paragraphs. 

1.2 Basic System Constraints and Hardware Considerations 

The purpose of the Space Shuttle TV communications 
system is to provide bi-directional transmission of images 
between a ground station and a space vehicle. The transmitted 
data are relayed via the TDRS satellite. The "uplink" 
transmission is defined as the transmission from a ground 
station to a space vehicle via the TDRS. The "downlink" 
transmission is likewise defined as the transmission from a 
space vehicle to a ground station via the TDRS. The basic 
constraints for the source data rates are as follows: 

1) Uplink Transmission: 1 Megabit per second. 

2) Downlink Transmission: 50 Megabits per second. 

The TV signals transmitted in the uplink mode consist 
primarily of scenes that are highly stationary. Because of 
the extremely low transmission data rate and the low rate of 
object displacement, the major recommendation of this study 
is to implement a motion detection algorithm that exploits 
the data redundancy resulting from stationary objects. Only 
monochrome TV signals are transmitted in this mode. 

The TV signal transmitted in the downlink mode consists 
of high quality video information gathered by the space 
vehicle. Color, fine picture detail, and motion quality are 
important considerations in this case. 

In the design of the Space Shuttle TV communications 
system, it is important to differentiate the hardware 
environments in which the encoders and decoders are located. 
The space vehicle, where one end of each communications link 
is located, is obviously limited in terms of hardware weight, 
power consumption and hardware size, whereas in the ground 
station these constraints can be greatly relaxed. Thus, in 
this report, special consideration is given to the design of 
the video source encoder for the downlink transmission and 
the video source decoder for the uplink transmission. 

1.3 Scope of this Report 

This report provides viable video data compression 
schemes for both the uplink and the downlink TV 
transmissions. The study is divided into three parts: 
(1) two-dimensional monochrome video data compression, 
(2) monochrome video data compression using motion 
detection, and (3) color video data compression. The 
two-dimensional monochrome data compression method was 
studied and simulated in previous LINKABIT work entitled 
"Study of Efficient Video Compression Algorithm for Space 
Shuttle Applications," Final Report [1]. The monochrome 
video data compression method using motion detection has 
been simulated on the LIM Video Controller for subjective 
quality observations. 
2.0 Monochrome Video Data Compression 

The art of video data compression is based exclusively 
upon the following two facts: 

FACT 1. Among all possible two-dimensional still 
pictures, only relatively few can be classified as 
recognizable to the "average" observer. The redundancy of 
the set of recognizable pictures occurs mainly as statistical 
correlations of video signals between adjacent sampling 
points. 

Other redundancies may occur in the form of psychovisual 
effects. For example, consider two frames of video signals, 
each of which is a reproduction of the same scene under 
different light illumination. From the signal point of view, 
these two frames may be quite distinct; yet from the 
"constant brightness" phenomenon we know that, from the 
vision point of view, the two frames are identical in detail 
and differ only in contrast and brightness. Psychovisual 
phenomena thus produce a type of redundancy in which two or 
more distinct frames of video signals may be interpreted by 
the average viewer as identical. These redundancies are 
deterministic in the sense that they can be applied whenever 
the situation arises. They should be distinguished from the 
statistical redundancy, used most often in two-dimensional 
video data compression schemes, which is obtained 
empirically by classifying a picture as recognizable if a 
number of selected viewers agree that it is. From the video 
data compression point of view, these statistics together 
with the psychovisual phenomena should be utilized to obtain 
a set of recognizable pictures in the sense of least 
information entropy. We shall refer to the resultant 
redundancy of this set as the spatial statistical redundancy 
of the "set of recognizable pictures." Other useful 
psychovisual phenomena are: 

a) The logarithmic response of the visual system. 

b) The human eye is insensitive to continuous small 
variations of grey level, but is highly sensitive to abrupt 
changes of grey level; this sensitivity occurs in the form 
of awareness only, as opposed to the estimation of exact 
grey level changes. 

c) The phenomenon of "tritanopia," also known as 
"two-color vision," by which the eyes tend to confuse, for 
small colored regions, the colors purple, bluish grey, and 
greenish yellow, yet exhibit good acuity for orange red and 
cyan. 

d) At normal observation distance, the spatial response 
of the eyes decreases as the spatial frequency (i.e., of a 
pattern of alternate light and dark levels) increases. 

FACT 2. Similarly, among the set of possible sequences 
of recognizable pictures, only relatively few can be 
classified as comprehensible to the "average" observer. 
Picture sequences containing fast object movements, 
unrelated contents, erratic changes of intensity, and many 
others are examples of incomprehensibility. Thus, among the 
set of recognizable picture sequences, redundancy occurs not 
only two-dimensionally as spatial statistical redundancy, 
but also in the form of time statistical redundancy. 

The statistical redundancies mentioned above enable a 
video data compressor to encode only the information 
pertinent to the set of recognizable picture sequences. 
Accordingly, an efficient video data compressor is an 
"intelligent" machine that is capable of identifying every 
psychovisual phenomenon. Such a machine must inevitably be 
complex in structure and is generally not suitable for a 
limited environment such as a space flight application. 
The objective of this study is to provide a feasible scheme 
that is simple in terms of hardware implementation and that 
provides a reasonable data compression ratio. 

We shall begin with a study of monochrome TV signal 
data compression. It is directly applicable to the uplink 
TV communication system of the Space Shuttle program. Since 
color information can be most efficiently transmitted by 
transforming the three base color components into the 
monochrome brightness and chrominance plane, the monochrome 
technique is equally essential to color TV data compression. 
The monochrome TV data compression is divided into two major 
steps: two-dimensional data compression and motion 
detection. The former utilizes solely the spatial 
statistical redundancy of the set of recognizable pictures. 
Additional properties due to the time statistical redundancy 
are incorporated in the motion detection technique. 

2.1 Two-Dimensional Monochrome Video Data Compression 

Among the many monochrome video data compression 
techniques, the most feasible one was originally 
investigated by Landau and Slepian [2]. It uses exclusively 
the property of spatial statistical correlation between 
video signals of adjacent elements. Time statistical 
correlation and other two-dimensional psychovisual phenomena 
are not invoked; consequently, the achievable compression 
ratio is not very high. Because of its importance to the 
technique proposed by this study, a refined version is given 
in the "Study of Efficient Video Compression Algorithms for 
Shuttle Applications," Final Report [1], which is summarized 
as follows: 

We may consider each frame of video signal as being 
sampled by an A-to-D converter. The A-to-D converter 
samples uniformly across each horizontal line, at a sampling 
rate above the Nyquist rate for the desired video bandwidth. 
In addition, each sampling point is linearly approximated 
by a sufficient number of information bits so that no 
degradation due to false image contouring is visible. Using 
512 samples per horizontal line and 8 information bits per 
sample seems sufficient for this purpose. Since each frame 
contains 480 visible horizontal lines (240 lines per field), 
we can represent each frame of the digitized video signal as 
a lattice of 480 x 512 sampling points, each of which has an 
integer representation between 0 and 255. Using this 
representation, there are $256^{480 \times 512}$ possible 
pictures. Among these, most are not recognizable to an 
average observer. Considering the subset of those which are 
recognizable to an average observer, one of the most obvious 
properties of this subset is the statistical correlation 
between adjacent lattice points. It is clear that the 
statistical correlation between two different sampling 
points decreases as their geometrical distance increases. 
One method of exploiting this spatial statistical 
correlation is to partition the lattice into an aggregate of 
subpictures, where in each subpicture the statistical 
correlation between any two sampling points is high. To 
minimize the computational complexity, the shape of the 
subpictures must be uniform. One of the intuitive choices is 
the shape of a rectangle. Let the subpicture be a rectangle 
of size m x n (m vertically and n horizontally), where m and 
n are divisors of 480 and 512, respectively. (Note that, 
from an equidistance (between boundaries) point of view, a 
hexagonal mosaic-like pattern should be optimal; however, 
this would induce unwanted complications at the frame 
edges.) Each 
subpicture can be considered as an m x n dimensional random 
vector: 

$$
Y = \begin{bmatrix}
X_{11}, & X_{12}, & \ldots & X_{1n} \\
X_{21}, & X_{22}, & \ldots & X_{2n} \\
\vdots  &         &        & \vdots \\
X_{m1}, & X_{m2}, & \ldots & X_{mn}
\end{bmatrix}
\qquad (2.1)
$$


One can form the ensemble of recognizable subpictures 
as the aggregate of those members extracted from the set of 
recognizable pictures. Data compression, due to the 
statistical redundancy of this set, can be achieved by 
statistical analysis. In the least mean square error sense, 
this corresponds to the Karhunen-Loeve procedure, which 
involves the construction of an orthonormal basis that 
diagonalizes the covariance matrix of (2.1): 

$$
E[Y \cdot Y^T] = \int_{\Omega} Y \cdot Y^T \, P_Y(\omega) \, d\omega
\quad (mn \times mn \text{ matrix}) \qquad (2.2)
$$

where $\Omega$ is the sample space consisting of $256^{mn}$ 
elements, and $P_Y(\omega)$ is the joint probability density 
function of Y. Bit rate reduction is then obtained by 
quantizing or discarding data components, in the transformed 
coordinates, that have low variances. Since visual 
sensation, where the ultimate judgement of the reproduced 
picture lies, is quite different from the expected mean 
square error criterion, other types of orthonormal 
transformations were sought. In particular, the Hadamard 
transformation was judged to be superior. In a Hadamard 
transformation, the basis vectors are obtained from the row 
vectors of a Hadamard matrix. An n x n integral matrix, H, 
is Hadamard of order n if 

$$
H^T H = nI, \qquad (2.3)
$$

where $H^T$ is the transpose of H, and I is the n x n 
identity matrix. Hadamard matrices only occur for orders n 
which satisfy the condition: 

$$
n = 1, \; 2 \quad \text{or} \quad n \equiv 0 \pmod 4 .
$$

In particular, the Hadamard matrix of order $2^k$ can be 
obtained easily as the k-th tensor product of $H_2$: 

$$
H_{2^k} = \underbrace{H_2 \otimes H_2 \otimes \cdots \otimes H_2}_{k \text{ terms}} \qquad (2.4)
$$

where $\otimes$ is the standard tensor product notation, and 

$$
H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
$$

Hence, when $m = 2^{K_1}$ and $n = 2^{K_2}$, the basis 
vectors of the Hadamard transformation can be obtained from 
the row vectors of $H_{2^{K_1 + K_2}}$, which in turn is 
obtained as the $(K_1 + K_2)$-th tensor product of $H_2$. 
The Hadamard matrix obtained this way has entries +1 or -1; 
therefore, the orthonormal basis can be obtained by dividing 
each row vector by the constant $\sqrt{mn}$. 
Let us denote the basis vectors thus obtained by 

$$
\{\, b_{ij} : i = 1, \ldots, m; \; j = 1, \ldots, n \,\}.
$$

In particular, the first basis vector has the form: 

$$
b_{11} = \frac{1}{\sqrt{mn}}
\begin{bmatrix}
1 & 1 & \ldots & 1 \\
\vdots & & & \vdots \\
1 & 1 & \ldots & 1
\end{bmatrix}
\quad (m \times n \text{ vector})
$$

The remaining basis vectors have the form of $1/\sqrt{mn}$ 
multiplied by an m x n vector, half of whose entries have 
value +1 and the remaining half have value -1. The 
components of Y in the Hadamard coordinates can be obtained 
as follows: 

$$
C_{ij} = Y \cdot b_{ij}, \qquad i = 1, \ldots, m; \; j = 1, \ldots, n,
$$

$$
C_{11} = \frac{1}{\sqrt{mn}} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}.
\qquad (2.6)\text{-}(2.8)
$$

The remaining $C_{ij}$'s have the form of $1/\sqrt{mn}$ 
multiplied by the difference of two terms, each of which is 
the sum of half of the $X_{ij}$'s. $C_{11}$ is also referred 
to as the d.c. component of the subpicture, because it is a 
constant multiple of the sum of the video signals of the 
sample points within the subpicture. 

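Conversely, given a vector, Z, in the Hadamard basis, 
the corresponding subpicture Y' can be retrieved by the 
inverse transformation: 

$$
Y' = \begin{bmatrix}
X'_{11}, & X'_{12}, & \ldots & X'_{1n} \\
\vdots   &          &        & \vdots  \\
X'_{m1}, & X'_{m2}, & \ldots & X'_{mn}
\end{bmatrix}
\qquad (2.9)
$$

where 

$$
X'_{ij} = Z \cdot b_{ij}.
$$

Equivalently, the Hadamard transformation can be 
considered as 

$$
Z = \begin{bmatrix}
C_{11}, & \ldots & C_{1n} \\
\vdots  &        & \vdots \\
C_{m1}, & \ldots & C_{mn}
\end{bmatrix}
= HY \qquad (2.10)
$$

where H is the normalized Hadamard matrix of order mn, 
with 

$$
H^T \cdot H = I.
$$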
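
The transform pair (2.9)-(2.10) can be illustrated with a short Python sketch. It assumes the tensor product construction of (2.4), a row-by-row flattening of a 4 x 4 subpicture, and made-up sample values. The helper names (`kron`, `hadamard`, `transform`) are hypothetical, and the natural component ordering produced here is not necessarily the designation used in Figure 2.2.

```python
import math

H2 = [[1, 1], [1, -1]]


def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def hadamard(k):
    """Unnormalized Hadamard matrix of order 2**k (k >= 1), as in (2.4)."""
    h = H2
    for _ in range(k - 1):
        h = kron(h, H2)
    return h


def transform(h, y):
    """Multiply the flattened subpicture y by h / sqrt(order)."""
    scale = 1.0 / math.sqrt(len(h))
    return [scale * sum(hij * yj for hij, yj in zip(row, y)) for row in h]


# Illustrative 4 x 4 subpicture (assumed values), flattened row by row.
subpicture = [
    100, 102, 101, 99,
    101, 103, 100, 98,
    102, 104, 102, 100,
    103, 105, 103, 101,
]

H16 = hadamard(4)                 # order mn = 16
z = transform(H16, subpicture)    # Hadamard components, z[0] is the d.c. term
restored = transform(H16, z)      # the normalized matrix is its own inverse

print("d.c. component:", z[0])    # sum of the samples divided by sqrt(16)
print("max restore error:", max(abs(a - b) for a, b in zip(subpicture, restored)))
```

Because the matrix built this way is symmetric and satisfies (2.3), applying the normalized transform twice returns the original samples, which is why a single `transform` function serves for both directions in this sketch.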


Statistical analysis can be applied to the vector Z. 
Using a subpicture size of 4 x 4, empirical analysis has 
revealed that the variance of $C_{11}$ is the highest and is 
in excess of 10 to 1 in ratio to the next highest variance 
[2]. The remaining components have relatively small 
variances. If these $C_{ij}$'s are coarsely quantized close 
to their mean values (note: $E[C_{ij}] = 0$ for $i \neq 1$ 
or $j \neq 1$), then, because of their small variances, the 
resultant expected mean square error of the reconstructed 
video signal should be small. This method is theoretically 
sound; however, because visual sensation is quite different 
from the mean square error criterion, improving the encoding 
method requires an adequate selection of quantizations. In 
[1] the psychovisual property that the human eye possesses a 
logarithmic response was used to obtain a set of 
quantizations. The logarithmic quantization is chosen as 
follows. Let the number of representation values be 
$N_1 + N_2 + 1$, corresponding to $N_1$ positive values, 
$N_2$ negative values, and zero. Let the resultant code 
correspond to the set 

$$
\{ -N_2, \; -N_2 + 1, \; \ldots, \; -1, \; 0, \; 1, \; 2, \; \ldots, \; N_1 - 1, \; N_1 \}.
$$

Then the inverse quantization values are chosen as follows: 

$$
V_k = -A_1 \left( e^{-B_1 k} - 1 \right) \quad \text{for } k = 0, -1, \ldots, -N_2
$$

and 

$$
V_k = A_2 \left( e^{B_2 k} - 1 \right) \quad \text{for } k = 0, 1, 2, \ldots, N_1
$$

and the cutpoints are chosen as the $N_1 + N_2$ adjacent 
arithmetic means of the sequence: 

$$
\{ V_{-N_2}, \; V_{-N_2 + 1}, \; \ldots, \; 0, \; V_1, \; \ldots, \; V_{N_1} \}.
$$

This quantization scheme allows the representation of zero. 
In addition, errors are allowed to increase exponentially. 
The constants $A_1$, $A_2$, $B_1$ and $B_2$ (all positive 
real) determine the graininess and the maximum (or minimum) 
value of the quantization. Thus, the quantization for a 
particular component is determined once the number of levels 
is allocated. 
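
As a minimal sketch of the logarithmic quantizer described above, the following Python builds the representation values $V_k$ and the arithmetic-mean cutpoints, then maps a value to its code. The constants (A = 4, B = 0.5, three levels on each side) are illustrative assumptions, not values taken from the report, and the function names are hypothetical.

```python
import math
from bisect import bisect_right


def log_quantizer(a1, b1, n2, a2, b2, n1):
    """Return (levels, cutpoints) for the codes -n2, ..., 0, ..., n1."""
    negative = [-a1 * (math.exp(-b1 * k) - 1.0) for k in range(-n2, 0)]
    positive = [a2 * (math.exp(b2 * k) - 1.0) for k in range(0, n1 + 1)]
    levels = negative + positive          # ascending; code 0 maps to 0.0
    cutpoints = [(lo + hi) / 2.0 for lo, hi in zip(levels, levels[1:])]
    return levels, cutpoints


def quantize(x, cutpoints, n2):
    """Code in -n2..n1 whose decision interval contains x."""
    return bisect_right(cutpoints, x) - n2


def dequantize(code, levels, n2):
    """Representation value V_k for the given code k."""
    return levels[code + n2]


# Assumed example constants (not from the report).
N1 = N2 = 3
levels, cutpoints = log_quantizer(4.0, 0.5, N2, 4.0, 0.5, N1)

for sample in (-20.0, -1.0, 0.0, 3.0, 9.0):
    code = quantize(sample, cutpoints, N2)
    print(sample, "->", code, "->", round(dequantize(code, levels, N2), 3))
```

The spacing between adjacent levels grows with |k|, so small values are represented finely and large values coarsely, which is the behavior the text attributes to the logarithmic response of the eye.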

In addition, the d.c. component is encoded using the 
DPCM method, which has the following advantages: 

a) A smooth transition of grey level between adjacent 
   subpictures, which is characterized by small video 
   amplitude variations, can be achieved with fewer bits 
   than would otherwise be required. The graininess, or 
   smallest quantum jump, is given by 

$$
\begin{aligned}
& A_1 \left( e^{B_1} - 1 \right) && \text{for } - \text{ (i.e., light to dark)} \\
& \text{and} \\
& A_2 \left( e^{B_2} - 1 \right) && \text{for } + \text{ (i.e., dark to light)}
\end{aligned}
\qquad (2.12)
$$

b) When the grey level transition between adjacent 
   subpictures is large, it is approximated 
   logarithmically by the quantizer. Due to the 
   logarithmic response of the human visual system, the 
   corresponding visual sensation error is low. 

c) The logarithmic DPCM coding method can correct a step 
   function to within the graininess given by (2.12) in at 
   most M steps, where M is a logarithmic function of the 
   amplitude of the step function and the graininess. This 
   is illustrated in Figure 2.1. 
Figure 2.1. Typical Response of Logarithmic DPCM to Step Error 


The non-d.c. components are not DPCM encoded, because 
these components are generally uncorrelated between 
adjacent subpictures and because DPCM is inherently 
sensitive to channel noise. 
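
The step response claimed for the logarithmic DPCM can be sketched in Python as follows. The level list is the set of 32 representative values listed in Table 2.1; the loop structure (an estimate updated by the nearest representative difference, with no channel errors) is an assumption made for illustration, and the function names are hypothetical.

```python
# Representative difference values for the d.c. component (Table 2.1).
LEVELS = [876, 646, 476, 350, 258, 188, 138, 100, 71, 50, 36, 23, 15, 8, 4, 0,
          -4, -8, -14, -22, -32, -47, -65, -91, -124, -169, -229, -310, -417,
          -560, -750, -1000]


def quantize_difference(diff):
    """Nearest representative value, equivalent to arithmetic-mean cutpoints."""
    return min(LEVELS, key=lambda level: abs(level - diff))


def track_step(target, start=0, steps=6):
    """Follow a step from `start` to `target`; return the estimate after each step."""
    estimate = start
    history = []
    for _ in range(steps):
        estimate += quantize_difference(target - estimate)
        history.append(estimate)
    return history


history = track_step(500)
print(history)   # the estimate settles within the smallest quantum jump of 4
```

In this sketch a step of amplitude 500 is followed by a large correction and then a small one, after which the residual error is below the graininess and no further change is coded.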

The quantizations for the various Hadamard components 
(using a 4 x 4 subpicture size; see Figure 2.2) are given 
in Table 2.1, where 

   $C_{11}$ is quantized by 32 levels (i.e., 5 bits); 

   $C_{12}$ and $C_{21}$ are quantized by 7 levels; 

   $C_{13}$ and $C_{31}$ are quantized by 15 levels; 

   $C_{14}$ and $C_{41}$ are quantized by 9 levels; 

   $C_{12}$, $C_{13}$ and $C_{14}$ (likewise, $C_{21}$, 
   $C_{31}$ and $C_{41}$) share 10 information bits; 

   $C_{22}$, $C_{33}$ and one further component [illegible 
   in this copy] are quantized by 5 levels, and they share 
   7 information bits. 

The overall bit requirement for the two-dimensionally 
compressed picture is 32 bits per subpicture, or 2 bits per 
picture element. 
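
The bit budget above can be checked with a few lines of Python. The sketch assumes that components sharing bits are combined into one mixed-radix code word; the report states only that they "share" the bits, so the packing scheme and the names `shared_bits`, `pack` and `unpack` are illustrative assumptions.

```python
import math


def shared_bits(level_counts):
    """Bits needed for one code word that jointly indexes all the components."""
    combinations = math.prod(level_counts)
    return (combinations - 1).bit_length()


def pack(codes, radix):
    """Combine per-component level indices into a single mixed-radix word."""
    word = 0
    for code, base in zip(codes, radix):
        if not 0 <= code < base:
            raise ValueError("level index out of range")
        word = word * base + code
    return word


def unpack(word, radix):
    """Recover the per-component level indices from a mixed-radix word."""
    codes = []
    for base in reversed(radix):
        word, code = divmod(word, base)
        codes.append(code)
    return tuple(reversed(codes))


dc_bits = shared_bits([32])              # d.c. component
row_bits = shared_bits([7, 15, 9])       # 7-, 15- and 9-level components
col_bits = shared_bits([7, 15, 9])       # the transposed group
diag_bits = shared_bits([5, 5, 5])       # three 5-level components
total = dc_bits + row_bits + col_bits + diag_bits

print(dc_bits, row_bits, col_bits, diag_bits, total, total / 16)
print(unpack(pack((6, 14, 8), (7, 15, 9)), (7, 15, 9)))
```

Since 7 x 15 x 9 = 945 does not exceed 1024 and 5 x 5 x 5 = 125 does not exceed 128, the groups fit in 10 and 7 bits respectively, which gives the stated total of 32 bits per 4 x 4 subpicture.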
Table 2.1. Quantization Table 

$C_{11}$: DPCM Logarithmically by 5 Bits (Range: $0 \le C_{11} \le 1024$) 

Each cutpoint separates the representative value on its row 
from the representative value on the next row. 

   Representative Value      Cutpoint 
            876                 761 
            646                 562 
            476                 414 
            350                 304 
            258                 223 
            188                 163 
            138                 118 
            100                  85 
             71                  61 
             50                  42 
             36                  29 
             23                  19 
             15                  11 
              8                   6 
              4                   2 
              0                  -2 
             -4                  -6 
             -8                 -11 
            -14                 -18 
            -22                 -27 
            -32                 -39 
            -47                 -56 
            -65                 -78 
            -91                -107 
           -124                -147 
           -169                -199 
           -229                -269 
           -310                -363 
           -417                -488 
           -560                -655 
           -750                -877 
          -1000                   - 

Non-d.c. components have range: -512 to +512 

$C_{13}$ and $C_{31}$: Quantized by 15 Levels 

   Representative Value      Cutpoint 
           ±150                ±122 
            ±94                 ±76 
            ±59                 ±47 
            ±35                 ±28 
            ±20                 ±15 
            ±10                  ±7 
             ±4                  ±2 
              0 

$C_{14}$ and $C_{41}$: Quantized by 9 Levels 

   Representative Value      Cutpoint 
            ±70                 ±53 
            ±36                 ±26 
            ±17                 ±11 
             ±6                  ±3 
              0 

$C_{12}$ and $C_{21}$: Quantized by 7 Levels 

   Representative Value      Cutpoint 
            ±60                 ±43 
            ±26                 ±17 
             ±9                  ±4 
              0 

$C_{22}$, $C_{33}$ and one further component [illegible in 
this copy]: Quantized by 5 Levels 

   Cutpoints: ±33, ±8 
   Representative values: [not legible in this copy] 
2.2 Variable Rate, Motion Detecting Video Data Compression 

Monochrome video data compression that does not utilize 
the time statistical redundancy over the ensemble of 
recognizable picture sequences cannot operate below 2 bits 
per pel without substantially sacrificing picture quality. 
Fixed rate video data compression using frame-to-frame 
differencing was studied in [1], where a compressed data 
rate of 1 bit per pel was achieved. To achieve a higher 
data compression ratio, it is necessary to use the fact that 
most objects in a recognizable picture sequence are 
stationary. Further, the human visual system tends to lose 
object detail if the rate of displacement is high, and 
consequently the fidelity criterion for reproducing moving 
objects can be relaxed. From the coding point of view, 
efficiency can be greatly increased if the number of 
information bits devoted to stationary objects is minimized. 
To achieve this, frame memories are required at both the 
encoder and the decoder. These are used to preserve the 
information of the preceding frame. At the decoder, the 
frame memory is used for display purposes, while at the 
encoder it is used to decide whether or not to transmit 
updating data. Ideally, the decision to transmit the 
updating information should coincide with object movement or 
mutation in the reproduced scene. Such a decision can be 
categorized as "motion detection." 


The most primitive form of video compression algorithm 
using motion detection is the "variable length, 
frame-to-frame difference" algorithm studied by Candy et al. 
[3], where the digitized video signal of each sampling point 
is compared with its predecessor. If the difference exceeds 
a certain pre-chosen threshold, then the updating 
information is transmitted. This scheme does not exploit 
the spatial statistical redundancy of the set of 
recognizable pictures. In addition, because of the large 
number of sampling points per frame, a large number of 
information bits must be spent on "position markers" to 
implement this scheme. Position markers are subcodewords 
used to instruct the encoder and decoder where to update the 
information. The resultant coding efficiency is not very 
attractive for reasonably active picture sequences. 
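
To make the cost of position markers concrete, the following Python sketch applies a per-sample threshold test and counts the bits spent if every updated sample carried an absolute address. The absolute-address scheme, the threshold value and the function names are assumptions for illustration only; they are not the coding used in [3].

```python
SAMPLES_PER_FRAME = 480 * 512      # lattice size used in Section 2.1
SAMPLE_BITS = 8                    # bits per sample used in Section 2.1


def changed_samples(previous, current, threshold):
    """Return (position, new_value) for samples whose change exceeds the threshold."""
    return [(i, cur) for i, (prev, cur) in enumerate(zip(previous, current))
            if abs(cur - prev) > threshold]


def update_cost_bits(num_updates):
    """Bits needed when each update carries an absolute position marker (assumed)."""
    position_bits = (SAMPLES_PER_FRAME - 1).bit_length()
    return num_updates * (position_bits + SAMPLE_BITS)


previous_line = [10, 10, 10, 10, 10, 10]
current_line = [10, 14, 10, 30, 10, 2]
updates = changed_samples(previous_line, current_line, threshold=5)

print(updates)
print(update_cost_bits(len(updates)), "bits for", len(updates), "updates")
```

Under these assumptions each updated sample costs 18 address bits in addition to its 8 data bits, which illustrates why per-sample updating becomes inefficient as scene activity rises.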

We shall describe a new motion detection algorithm 
that seems suitable for picture communication purposes, in 
particular, where a high data compression ratio is required 
for reasonably stationary picture sequences. 

In the human visual system, under well-lit conditions, 
object movements or mutations are characterized by light 
intensity changes within small regions of the frame (for 
color picture sequences, chrominance changes should also be 
taken into account). Hence, it is reasonable to partition 
the frame into disjoint regions. Again, as in the 
two-dimensional data compression case, the rectangular shape 
constitutes a reasonable choice. Each region, which will be 
referred to as a "motion region," represents the minimum 
size of detectable object mutation. 

A smaller motion region corresponds to finer motion 
resolvability, but at the expense of more information bits, 
which must be used for addressing information. An object 
mutation occurring in a motion region can be defined as a 
deviation of the light intensity from that of its 
predecessor beyond a pre-chosen threshold. (Ideally, the 
motion detection should take into account all psychovisual 
phenomena, such as constant brightness, so that certain 
changes, such as uniform illumination changes, will not be 
interpreted as motion. Again, such an "intelligent" motion 
detecting machine is inevitably complex.) 

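Let the motion region be a rectangle of m x n samples: 

$$
Y = \begin{bmatrix}
X_{11}, & \ldots & X_{1n} \\
\vdots  &        & \vdots \\
X_{m1}, & \ldots & X_{mn}
\end{bmatrix}
\quad (m \times n \text{ vector}) \qquad (2.13)
$$

Then the light intensity of the region, normally 
defined as energy radiated per unit area, is given by 

$$
I(Y) = \frac{k}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}^2
     = \frac{k}{mn} \, Y^T \cdot Y, \qquad (2.14)
$$

where k is a constant. 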
Although (2.14) is a simple arithmetic function, 
it is suspected that the effective intensity can be greatly 
simplified. This follows from the reasoning that, at normal 
observation distance from the displaying picture tube, the 
visual response to intensity and detail diminishes towards 
higher spatial frequencies. This leads us to suspect that 
the best way to estimate the intensity is to decompose 
(2.14) spatially (in particular, by Hadamard transformation) 
and to note that if Y is transformed into Hadamard 
components: 

$$
Z = \begin{bmatrix}
C_{11}, & \ldots & C_{1n} \\
\vdots  &        & \vdots \\
C_{m1}, & \ldots & C_{mn}
\end{bmatrix}
= HY \quad (m \times n \text{ vector})
$$

then 

$$
Y^T \cdot Y = Z^T \cdot Z = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij}^2 . \qquad (2.15)
$$

Consequently, it is natural to decompose I(Y) into 

$$
I(Y) = \sum_{i=1}^{m} \sum_{j=1}^{n} I_{ij}(Y), \qquad (2.16)
$$

where $I_{ij}(Y) = (k/mn) \, C_{ij}^2$ is the intensity 
contributed by the spatial frequency corresponding to the 
pattern given by $b_{ij}$. It is suspected that the 
intensity is mainly concentrated in the d.c. component. 
This can be shown as follows: 

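The expected intensity for the motion region is

$$ E[I(Y)] = \int I(\omega)\, p_Y(\omega)\, d\omega = \frac{k}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ \mathrm{var}(C_{ij}) + \bar{C}_{ij}^{\,2} \right], $$

where

$$ \bar{C}_{ij} = \text{mean value of } C_{ij} = 0 \quad \text{for } i \neq 1 \text{ or } j \neq 1. $$

$\bar{C}_{11}$ can be estimated by assuming that $C_{11}$ is
uniformly distributed over its range, i.e.,

$$ 0 \le C_{11} \le E = 255 \sqrt{mn}. $$

From the assumption of uniform distribution, we have

$$ \bar{C}_{11} = E/2, $$

$$ \mathrm{var}(C_{11}) = E^2/3 - E^2/4 = E^2/12. $$

Thus, we have

$$ E[I_{11}(Y)] = \frac{k}{mn} \left( \frac{E^2}{12} + \frac{E^2}{4} \right) = \frac{k E^2}{3mn}, $$

$$ E[I_{ij}(Y)] = \frac{k E^2}{12mn}\, \sigma_{ij} \quad \text{for } i \neq 1 \text{ or } j \neq 1, $$

where $\sigma_{ij} = \mathrm{var}(C_{ij}) / \mathrm{var}(C_{11})$.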


For the 4x4 Hadamard square, when the statistics are
taken over the ensemble of recognizable subpictures, using the
typical statistics obtained by Landau and Slepian, i.e.,
Table 2.2,

    (i,j)   var(C_ij)/var(C_11)      (i,j)   var(C_ij)/var(C_11)

    (1,2)         .051               (1,4)         .035
    (2,1)         .048               (4,1)         .038
    (2,2)         .014               (2,4)         .016
    (1,3)         .098               (4,2)         .015
    (3,1)         .087               (3,4)         .024
    (2,3)         .022               (4,3)         .024
    (3,2)         .02                (4,4)         .019
    (3,3)         .034

                          Table 2.2


we have

$$ E[I(Y)] = E[I_{11}(Y)] + \sum_{(i,j) \neq (1,1)} E[I_{ij}(Y)], $$

where the second term is the total expected intensity
contributed by the non-d.c. components, which is about 12% of
the total expected intensity. Note, however, that if we
assume the distribution of $C_{11}$ concentrates about its
mean, as in the case of a normal video reproduction, then the
expected intensity will further concentrate on the d.c.
component. It is also interesting to note that intensity
estimation using $C_{11}$ and a few additional components,
$C_{31}$ among them, contributes 93% of the total expected
intensity.

The above leads us to believe that the intensity
approximation using the d.c. component (or few additional
components) is sufficient for motion detection purposes.
Motion detection in the Hadamard coordinates has the
additional advantage that the two-dimensional data compression
obtained previously can be applied directly.
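The 12% figure can be reproduced from Table 2.2. The sketch below works in units of k E^2 / (12mn), so that the d.c. term is var(C_11) + mean^2 = 1 + 3 and each non-d.c. term is its tabulated variance ratio. The names `SIGMA` and `intensity_shares` are illustrative and do not come from the report.

```python
# Variance ratios var(C_ij)/var(C_11) as printed in Table 2.2.
SIGMA = {
    (1, 2): 0.051, (1, 4): 0.035, (2, 1): 0.048, (4, 1): 0.038,
    (2, 2): 0.014, (2, 4): 0.016, (1, 3): 0.098, (4, 2): 0.015,
    (3, 1): 0.087, (3, 4): 0.024, (2, 3): 0.022, (4, 3): 0.024,
    (3, 2): 0.020, (4, 4): 0.019, (3, 3): 0.034,
}


def intensity_shares(sigma):
    """Return (d.c. share, non-d.c. share) of the expected intensity.

    Units are k*E^2/(12*m*n): the d.c. term is E^2/12 + E^2/4, i.e. 1 + 3,
    under the uniform-distribution assumption made in the text.
    """
    dc = 1.0 + 3.0
    non_dc = sum(sigma.values())
    total = dc + non_dc
    return dc / total, non_dc / total


dc_share, non_dc_share = intensity_shares(SIGMA)
print(f"d.c. share: {dc_share:.3f}, non-d.c. share: {non_dc_share:.3f}")
```

The non-d.c. share comes out just under 0.12, in line with the value quoted in the text.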

Thus, our motion detection algorithm consists of:

1) The size of motion region is selected to be
one or an integral multiple of Hadamard subpictures.
If the motion region contains L subpictures, then
the effective intensity of the motion region is
defined as

$$ X = \sum_{j=1}^{L} C_{11}^{(j)}, $$

where $C_{11}^{(j)}$ is the d.c. component of the j-th
subpicture.

2) A reference frame is transmitted using the two- 
dimensional compression technique given in 2.1. 

3) At the encoder, a frame of memory is used to
store the information of the d.c. components.
The decoder frame memory has sufficient storage
for the d.c. components and the encoded non-d.c.
components.

4) For the subsequent frames, the intensity of each
motion region is compared with its predecessor. When
the difference exceeds a pre-chosen threshold, the
difference of the d.c. component is quantized by 5
information bits (using the same quantization
table as in (1)). This, along with the new
compressed non-d.c. components and an address code,
is transmitted. Simultaneously, the frame
memory of the encoder is updated.

5) At the receiving end, the address code
received directs the decoder to the portion of the
frame memory where updating is to be performed.

6) Visual display is accomplished by inverse 
quantization lookup, inverse Hadamard trans- 
formation, and D-to-A conversion of the data 
stored in the decoder frame memory. 
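The encoder side of steps 3 and 4 can be sketched as follows for the case of one motion region per 4x4 subpicture. This is a simplified illustration, not the report's implementation: the 5-bit quantization of the d.c. difference and the coding of the non-d.c. components are omitted, and the names `dc_component`, `reference_memory` and `detect_motion` are hypothetical.

```python
def dc_component(frame, row, col, size=4):
    """d.c. Hadamard component of one size x size subpicture.

    Assumes orthonormal scaling, i.e. (sum of samples) / size.
    """
    total = sum(frame[row + i][col + j] for i in range(size) for j in range(size))
    return total / size


def reference_memory(frame, size=4):
    """Frame memory of d.c. components built from a reference frame."""
    return [[dc_component(frame, r, c, size)
             for c in range(0, len(frame[0]), size)]
            for r in range(0, len(frame), size)]


def detect_motion(frame, dc_memory, threshold, size=4):
    """Return [(region address, d.c. difference)] for regions whose d.c.
    component moved by more than the threshold, and update the memory."""
    updates = []
    cols = len(frame[0]) // size
    for br in range(len(frame) // size):
        for bc in range(cols):
            dc = dc_component(frame, br * size, bc * size, size)
            diff = dc - dc_memory[br][bc]
            if abs(diff) > threshold:
                updates.append((br * cols + bc, diff))
                dc_memory[br][bc] = dc   # encoder memory tracks what was sent
    return updates


# Toy 8x8 example: the lower-right subpicture brightens by 10 per sample.
frame0 = [[0] * 8 for _ in range(8)]
memory = reference_memory(frame0)
frame1 = [row[:] for row in frame0]
for i in range(4, 8):
    for j in range(4, 8):
        frame1[i][j] = 10
first_pass = detect_motion(frame1, memory, threshold=8)
second_pass = detect_motion(frame1, memory, threshold=8)
print(first_pass, second_pass)
```

In a real encoder the memory would be updated with the quantized difference rather than the exact one, so that encoder and decoder memories stay in step.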

Experiments using this algorithm have been performed 
on the LIM Video Controller at LINKABIT. In particular, 
the motion region was chosen to be one subpicture; the re- 
sultant motion detection simply reduces to a comparison of 
the d.c. components. The overall subjective picture quality 
was judged to be good. In particular, stationary objects 
were faithfully reproduced. This feature makes it very 
attractive for both normal and inactive scenes. 

Coding using the motion detection method requires
position markers to instruct both the encoder and decoder
where to update the data. The position markers should
be capable of addressing every motion region. (Note: because
the number of motion regions is substantially less than
that of sampling points, the number of information bits
spent on the position marker is greatly reduced.)
There are many different ways to implement the position
markers. The following is a list of possible encoding
schemes:

1) The most primitive scheme assigns to each item of
updating information a code of constant length, which
consists of two portions:

    |<--- 14 bits --->|<------- 32 bits ------->|
    |     address     |    updating content     |

There are (512/4) x (480/4) = 128 x 120 motion
regions (in the sequel, we have assumed one
motion region = 1 Hadamard square, i.e., 4x4
samples), which requires 14 information bits.
The overall compressed rate operates between
the following limits:

$$ 0 \le \text{compressed rate} \le 2 + \frac{14}{16} \text{ bits per pel.} $$

When properly synchronized by a frame
synchronizing code, this method has the effect
of being less sensitive to channel noise.

If, in addition, a horizontal synchronizing
code is provided for each horizontal line group,
then the position marker requires only 7 information
bits. The corresponding compressed rate
operates between the following limits:

$$ \Delta \text{ bits/pel} \le \text{compressed rate} \le 2 + \frac{7}{16} + \Delta \text{ bits/pel}, $$

where $\Delta$ is the efficiency loss due to the
synchronizing codes.


2) Updating information can be coded using a variable
length subcodeword, which contains the address information
and a set of updating data representing a number
of consecutive motion regions that need updating.
The length of this subcodeword can be determined
either by an end marker, a continuation marker, or a
number represented by a fixed number of information
bits:

a)
    | address | 32-bit data | 32-bit data | ... | 32-bit data | end marker |

b)
    | address | 32-bit data | c | 32-bit data | c | ... | 32-bit data | c |

where c is the continuation marker represented by a
fixed number of information bits.

c)
    | address | N | data | data | ... | data |
                  |<------- N data ------->|

This method is particularly efficient for picture
sequences with fairly active scenes. However, it is
quite sensitive to channel noise. For example, erroneous
deciphering of the end marker, or continuation
marker, or the number of consecutive data, N, will
result in errors for the subsequent decoding. In
this case, a synchronizing code, preferably for each
horizontal group, must be used to confine the
accumulative effect of channel error to limited areas.

3) Each motion region is coded by a motion indicator
using one information bit. For example, 1 represents
motion, and 0, no motion. If 1 occurs, then it is
followed by the 32 bits of updating information. The
corresponding compressed bit rate versus the
amount of motion detected is illustrated as curve A
of Figure 2.3. This method is also sensitive to
channel noise, and a provision of line group
synchronizing code is desirable.

4) The motion indicator can be implanted in the
subcodeword of the content information allocated for the
d.c. component. The d.c. component is coded by 5
information bits. During motion detection, the difference
of the d.c. components is quantized and transmitted.
If the difference, $\Delta$, has absolute value greater than
the prechosen threshold, then a subcodeword of length
27 bits, which contains the updating information for
non-d.c. components, is also transmitted. Otherwise,
only the quantized difference is transmitted.

    |<--- 5 bits --->|<------------- 27 bits ------------->|
    |    value Δ     | subcodeword for non-d.c. components |
    (d.c. component)   (sent only if |Δ| > threshold)

In this case, the compressed bit rate versus the
amount of motion detected is illustrated as curve
B of Figure 2.3.

Again, this method is sensitive to channel noise
and a provision of line group synchronizing code is
desirable.

Figure 2.3. Compressed Bit Rate vs. Motion Detected
(vertical axis: compressed bit rate, bits per pel)

In addition, due to the fact that the d.c.
components are transmitted in differential mode,
reference frames should be sent every so often
to reduce the accumulative effect of channel errors.
(This cumulative error effect only affects each
motion region individually.)
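The bit counts quoted for the position-marker schemes can be worked through in a few lines. The sketch below is a back-of-the-envelope check under stated assumptions: synchronizing-code overhead is ignored, `p` is the fraction of motion regions updated in a frame, and the straight-line rate formulas for schemes 3 and 4 are inferred from the bit counts in the text rather than read off Figure 2.3. All function names are illustrative.

```python
SAMPLES_PER_LINE, LINES, BLOCK = 512, 480, 4
DATA_BITS = 32                      # compressed data per 4x4 motion region
PELS_PER_REGION = BLOCK * BLOCK     # 16


def address_bits(count):
    """Bits needed to address `count` distinct items."""
    return (count - 1).bit_length()


regions_per_row = SAMPLES_PER_LINE // BLOCK            # 128
regions = regions_per_row * (LINES // BLOCK)           # 128 x 120


def rate_fixed_address(p, addr_bits):
    """Scheme 1: every updated region carries an address and 32 data bits."""
    return p * (DATA_BITS + addr_bits) / PELS_PER_REGION


def rate_indicator_bit(p):
    """Scheme 3: one indicator bit per region, plus 32 bits when it is 1."""
    return (1 + DATA_BITS * p) / PELS_PER_REGION


def rate_embedded_dc(p):
    """Scheme 4: 5-bit d.c. difference always, 27 more bits on motion."""
    return (5 + 27 * p) / PELS_PER_REGION


print(address_bits(regions), address_bits(regions_per_row))
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, rate_fixed_address(p, 14), rate_indicator_bit(p), rate_embedded_dc(p))
```

At p = 1 the first scheme reaches 2 + 14/16 bits per pel, matching the upper limit given for it, while scheme 4 tops out at exactly 2 bits per pel.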
2.3 Transforming the Variable Rate Algorithm into a Fixed
Rate Algorithm

The motion detection algorithm described in the above
section has the distinct disadvantage of being variable in
coding length, and thus dependent entirely on the motion
activity. To transmit the data, a rate buffer must be used.
Since, under arbitrary situations, the duration of active
scenes may vary widely, a very large buffer may be needed
to reproduce every frame of an arbitrary picture sequence.
This report proposes an adaptive rate buffer scheme that
seems most suitable for the up-link Space Shuttle TV
communication system. This scheme is now described:

At the encoder, there are two identical rate buffers.
Each buffer has sufficient storage to store one field of
encoded data (i.e., both updating data and position
markers). These buffers are used to store field 1 and field
2 of the encoded data. Data are transmitted in blocks of
one field. The buffers can be in one of the following
modes:

1) Active Mode: The buffer is being used to
transmit the data.

2) Dummy Mode: The buffer is being filled by the
source data.

3) Inactive Mode: No data are being taken or
filled.

The field 1 and field 2 buffers alternate between
active and dummy modes with a possible inactive mode in
between. Thus, under normal conditions, the contents of
one buffer are being transmitted while the other buffer is
being filled. Due to the variable length of the data
queues, the following events may occur:

1) The dummy is filled but the active buffer is
not emptied. In this case, no source data is
accepted. Furthermore, the video data compression
process is stopped. This is done by halting the
A-to-D converter, so that the frame memory of the
source encoder (which is used to store the d.c.
components) contains only information of the last
processed fields. This halting condition prevails
until the active memory is emptied; then the dummy
buffer is switched to active mode, and the emptied
buffer becomes the dummy. The dummy accepts only
the data of the next field in sequence (i.e., if
the last active buffer is field 1, then the dummy
accepts the data of the following immediately
available field 2). Thus, at most one frame time may
elapse between the last data taken from the active
buffer and the first data to be entered into the
dummy buffer. This can be done by reactivating the
A-to-D converter and the data compression process at
the beginning of the next field in sequence.

2) The active buffer is emptied, and the dummy
is not filled. (Note: it takes exactly one field
time to complete the filling, i.e., the input of
the rate buffer accepts, when activated, source
data at a real time rate.) In this case the active buffer
is made inactive. Similarly, if the active buffer
is emptied and the dummy is waiting to be filled
(this circumstance occurs when the next field in
sequence, described in (1), has not yet arrived),
then no data are being taken or filled. During
these periods, special codewords are transmitted
representing "blank" or "waiting situation" (or
the time can be used to transmit field synchronizing
codewords).

In doing so, data blocks of alternating fields 
are transmitted such that the beginning of each data 
block always occurs at about the beginning of the 
realtime field timing. The periods in between are 
filled with blanks. Many fields may not be process- 
ed if the picture contents are very active. 

In addition, special codewords are transmitted 
at the beginning of each field to identify whether 
the data following belong to field 1, field 2, reference 
data or updating data. 

At the receiver, no additional buffer memories
are needed. Again, the frame memory organization is
divided into field 1 and field 2. The memories
store the decoded d.c. components and the encoded
non-d.c. components. One of the field memories
is always in "display" mode, i.e., the data are
taken, inverse quantized, inverse Hadamard
transformed, D-to-A converted and displayed on the TV
monitor, while the other field memory is either
being updated or filled with reference data. If
the updating, or filling with reference data, is not
completed at the end of the normal TV field scan,
then the contents of the current displaying field
are repeated for the following two fields. This
condition prevails until the entrance of data is
completed at the end of the TV field scan
corresponding to the designated field of the displaying
memory. Then the roles of the field memories are
exchanged. In doing so, the TV monitor displays a
sequence of pictures that consists of alternate
bursts of the two fields. Each burst has odd
length:

    Field 1 ... Field 1,  Field 2 ... Field 2,  ....
    |<- odd number ->|    |<- odd number ->|

In particular, if the source data rate is lower
than the data transmission rate, then the burst
lengths become 1, i.e., the displayed picture
sequence coincides with the input picture sequence.
When the source data rate exceeds the transmission
rate, the effect is that the input picture sequence
is abridged, by discarding information in groups
of consecutive frames, where the contents become
active. This allows the transmission data rate to
catch up with the source data rate. Likewise, if the
source data rate is less than the transmission rate,
then blanks are transmitted to fill up the gaps
between the source and transmission. The overall
visual effect is that the transient details of
object movements or mutations are abridged, which
is not very objectionable from the picture
comprehensibility point of view.
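The alternating-buffer behaviour can be illustrated with a small timing model. This is only a sketch under simplifying assumptions that are not in the report: time is measured in field periods, filling a buffer takes exactly one period, the encoded size of each processed field is supplied by the caller, and a buffer may start filling only at a field boundary of the required parity once it has been emptied. The function name `capture_times` is hypothetical.

```python
import math


def capture_times(field_bits, channel_bits_per_field_time):
    """Return the field period at which each processed field is captured.

    field_bits[k] is the encoded size of the k-th processed field;
    fields alternate between field 1 and field 2, so consecutive
    capture times must differ by an odd number of field periods.
    """
    times = []          # capture (fill start) times, in field periods
    ends = []           # time at which each field finishes transmission
    channel_free = 0.0
    for k, bits in enumerate(field_bits):
        if k == 0:
            t = 0
        else:
            # The buffer being refilled is the one used two fields ago.
            buffer_free = ends[k - 2] if k >= 2 else 0.0
            t = max(times[-1] + 1, math.ceil(buffer_free))
            if (t - times[-1]) % 2 == 0:
                t += 1  # wrong parity: wait for the next field in sequence
        start = max(t + 1, channel_free)   # fill takes one field period
        channel_free = start + bits / channel_bits_per_field_time
        times.append(t)
        ends.append(channel_free)
    return times


# Inactive scene: every field fits in half a field period of channel time.
quiet = capture_times([1, 1, 1, 1], 2)
# Active scene: every field needs 2.5 field periods of channel time.
busy = capture_times([5, 5, 5, 5, 5], 2)
print(quiet, busy)
```

In the quiet case every field is processed; in the busy case the gaps between processed fields grow but stay odd, which is consistent with the odd burst lengths described above.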

This method is well suited for the space shuttle
uplink TV communications system for the following
reasons:

a) No additional rate buffer memories are required
at the decoder, which is located at the space vehicle
where the environment is limited. Consequently, the
augmentation of hardware complexity is minimal.

b) The rate buffer is only needed at the encoder,
which is located at the ground station, where the
increase in hardware complexity does not affect the
feasibility of the system.

c) Due to the nature of the very low information
rate for the uplink TV communication, the advantage of
time statistical redundancy (in the form of
stationary objects) is utilized maximally. In
particular, when the source rate is less than the
transmission rate (for a 1 Mbit per second uplink
rate, this corresponds roughly to motion detected
over no more than 6% of the frame), the picture
sequence is reproduced without abridging. When an
inevitably large amount of object motion or mutation
occurs, this is remedied by abridging the input sequence.
Note that the maximal abridging effect for 1 Mbit per
second corresponds to displaying the alternate
fields at approximately 1/4 second intervals. This
corresponds to 100% motion detected.
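The 6% and 1/4 second figures can be roughly reproduced. The sketch below assumes 30 frames per second (the rate implied by the per-frame and per-second bit counts in Section 3.1), 32 data bits plus a 7-bit address for each updated region, and no synchronizing overhead; these assumptions and the function names are illustrative, not taken from the report.

```python
UPLINK_BITS_PER_SECOND = 1_000_000
FRAMES_PER_SECOND = 30                      # assumed; see lead-in
REGIONS_PER_FRAME = (512 // 4) * (480 // 4)
DATA_BITS, ADDRESS_BITS = 32, 7


def max_motion_fraction():
    """Largest fraction of regions that can be updated every frame."""
    bits_per_frame = UPLINK_BITS_PER_SECOND / FRAMES_PER_SECOND
    bits_if_all_move = REGIONS_PER_FRAME * (DATA_BITS + ADDRESS_BITS)
    return bits_per_frame / bits_if_all_move


def field_time_at_full_motion():
    """Seconds needed to send one field when every region is updated
    (data bits only; consecutive regions need no individual address)."""
    regions_per_field = REGIONS_PER_FRAME // 2
    return regions_per_field * DATA_BITS / UPLINK_BITS_PER_SECOND


print(f"{max_motion_fraction():.3%} of the frame")
print(f"{field_time_at_full_motion():.3f} s per field")
```

Under these assumptions the break-even motion fraction is a little under 6% and a fully updated field takes just under a quarter of a second, in line with the figures in the text.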

The functional block diagrams in the uplink video
source encoder and decoder are illustrated in Figures 2.4
and 2.5. The field rate buffers mentioned above, due to
the random length nature of motion detected, are implemented
by random access memories. As mentioned previously, these
accept data (i.e., both encoded data and/or position markers)
at a real time rate.

The rate buffer described above is adaptive in the
sense that its input is controlled by its output. This is
shown in Figure 2.5, where an "end-of-field" decoder is used
to sense the end of information taken from the buffer and to
signal the controller to swap the roles of the rate buffers.
This decoding function does not necessarily require codeword
deciphering, in the usual sense, where complicated hardware
may be required. For example, this can be accomplished by
using a content counter for each field rate buffer. The
counters keep track of the amount of data entered during the
updating mode. In the active mode, the counters keep track
of the amount of data transmitted. A typical hardware
implementation for the rate buffer is shown in Figure 2.6, where
a random access memory of size 299.52 Kbits is used. The
memory is divided into two parts: data memory and motion
address memory (special synchronizing codewords are used
for each horizontal line group). During updating (or
entering reference data), data are entered in blocks of 32 bits
for the compressed data and 7 bits for the motion address
memory (used only in the updating mode). When the data
compressor detects the first motion, the motion region address
is entered followed by the updating data. Simultaneously,
the buffer counter is incremented (at the beginning of
the updating mode, it is reset to 0). If motion is
detected on the motion region of the next successive
address, then a continuation marker is entered to the
motion address memory. In other words, motion address
data is entered only if the motion region is the leader
of a burst of consecutive updating motion regions. Data
enters the rate buffer sequentially. During the active
mode, data is taken out bit by bit at the transmission
rate. If a continuation marker is encountered, then the
address code is ignored. If not, then an end marker is
transmitted and this is followed either by the location address
of updating motion regions, or by the special end-of-line
synchronizing codeword. The buffer counter is decremented
each time a block of data is transmitted. End of active mode
is signalled by the 0 content of the buffer counter.
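The content-counter idea can be sketched in software. The class below is a hypothetical illustration, not the hardware of Figure 2.6: it models only the counter that signals end of field, not the bit-serial output or the continuation markers. The size check assumes that 299.52 Kbits corresponds to one field of 7680 motion regions at 32 + 7 bits each, which matches the stated figure but is an inference.

```python
from collections import deque

REGIONS_PER_FIELD = (512 // 4) * (480 // 4) // 2   # 7680
BITS_PER_REGION = 32 + 7                           # data + motion address
BUFFER_BITS = REGIONS_PER_FIELD * BITS_PER_REGION


class FieldRateBuffer:
    """One field rate buffer with a content counter (illustrative only)."""

    def __init__(self):
        self.blocks = deque()
        self.counter = 0

    def start_update(self):
        """Beginning of the updating mode: the counter is reset to 0."""
        self.blocks.clear()
        self.counter = 0

    def enter(self, address, data):
        """Store one 32-bit data block with its motion region address."""
        self.blocks.append((address, data))
        self.counter += 1

    def transmit(self):
        """Take out the oldest block; the counter is decremented."""
        block = self.blocks.popleft()
        self.counter -= 1
        return block

    def end_of_field(self):
        """End of active mode is signalled by a zero counter."""
        return self.counter == 0


buf = FieldRateBuffer()
buf.start_update()
buf.enter(17, 0xDEADBEEF)
buf.enter(18, 0x0BADF00D)
sent = [buf.transmit()]
still_active = not buf.end_of_field()
sent.append(buf.transmit())
print(BUFFER_BITS, sent, buf.end_of_field())
```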




3.0 Color Video Source Coding


We shall begin this section with the fundamental
principles of color TV communication and then proceed to
devise a feasible digital color video data compressor
suitable for the down-link space shuttle TV communication
system. Color picture reproduction relies mainly on the
principle of three primary color decompositions, where the
picture is sampled through a red, green, and blue filter.
Hence, one may treat a color signal as a vector of three
separate monochrome signals corresponding to the red, green,
and blue contents of the object picture. If the color signal
were transmitted in this manner, it would require three times
the bandwidth needed for the monochrome transmission. As
in the monochrome case, statistical redundancies exist not
only as the spatial or time statistical correlations for
each color, but also in the form of intra-color redundancy.
Data compression can be accomplished in three separate steps.
First, the bandwidth can be reduced due to the intra-color
redundancy, then the spatial and time statistical data
reduction described in Section 2.0 can be applied directly.
The intra-color statistical redundancy can be described by
considering the random vector,

$$ S(t) = [\, R(t),\ G(t),\ B(t) \,]^T, $$
where R(t) = red video signal
      G(t) = green video signal
      B(t) = blue video signal,
as a continuous random process. Bandwidth reduction, in
the sense of the least expected mean square error
criterion, can be obtained by applying the Karhunen-Loeve
procedure to the set of recognizable color pictures. This
normally results in a linear transformation of the original
color signal vector,

$$ \begin{bmatrix} K_1(t) \\ K_2(t) \\ K_3(t) \end{bmatrix} = M \begin{bmatrix} R(t) \\ G(t) \\ B(t) \end{bmatrix}, $$

where M is a 3 x 3 matrix that diagonalizes the
covariance matrix

$$ E[\, S(t) \cdot S(t)^T \,]. $$

This Karhunen-Loeve procedure depends mainly on empirical
analysis. Psychovisual phenomena are not invoked. Other
transformations were sought. One of the methods used in
the standard NTSC color TV transmission is to transform the
signal vector S(t) into the coordinates that consist of
the monochrome brightness and two vectors that lie on the
chrominance plane, see Figure 3.1.

$$ \begin{bmatrix} Y(t) \\ I(t) \\ Q(t) \end{bmatrix} = M \begin{bmatrix} R(t) \\ G(t) \\ B(t) \end{bmatrix}, $$    (3.1)

where

$$ M = \begin{bmatrix} .30 & .59 & .11 \\ .60 & -.28 & -.32 \\ .21 & -.52 & .31 \end{bmatrix}. $$    (3.2)



Y(t) corresponds to the monochrome brightness, I(t)
corresponds to the vector in the "orange red-cyan" direction in
the chrominance plane, and Q(t) corresponds to the vector
in the "magenta-green" direction. This choice of
coordinates has the following advantages:

1) A black and white monochrome video signal can be
readily formed by simply dropping the Q and I
components.

2) It reflects the psychovisual phenomenon that the
human eye senses only black and white at very low
luminosity.

3) For small color areas, the human eye exhibits
"tritanopia" (or two color vision). This corresponds
to the decrease of spatial visual sensitivity in the
Q component.
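The first advantage is easy to see numerically. The sketch below applies the NTSC transformation with the usual two-decimal coefficients (.30, .59, .11 / .60, -.28, -.32 / .21, -.52, .31) to a few normalized RGB triples; the function name `rgb_to_yiq` and the test colors are illustrative.

```python
# NTSC RGB -> YIQ matrix with coefficients rounded to two decimals.
M = [
    [0.30, 0.59, 0.11],    # Y: monochrome brightness
    [0.60, -0.28, -0.32],  # I: orange red - cyan axis
    [0.21, -0.52, 0.31],   # Q: magenta - green axis
]


def rgb_to_yiq(r, g, b):
    """Apply the 3 x 3 transformation to one RGB sample (values in 0..1)."""
    return tuple(row[0] * r + row[1] * g + row[2] * b for row in M)


white = rgb_to_yiq(1.0, 1.0, 1.0)
gray = rgb_to_yiq(0.5, 0.5, 0.5)
red = rgb_to_yiq(1.0, 0.0, 0.0)
print(white, gray, red)
```

For any neutral (R = G = B) input both chrominance components vanish, so dropping I and Q leaves the brightness signal unchanged.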

Due to these facts, color video signals can be
transmitted, with reasonable picture quality, when Y(t) has a
bandwidth of 4 MHz, I(t) has a bandwidth of 1.5 MHz, and Q(t)
has a bandwidth of only .5 MHz. Standard commercial color
TV signals are transmitted by modulating the chrominance
signals, I(t) and Q(t), by a subcarrier $\omega_c$:

$$ M(t) = I(t) \sin \omega_c t + Q(t) \cos \omega_c t = C_s(t) \sin(\omega_c t + C_h(t)), $$    (3.3)

where

$$ C_s(t) = \sqrt{I^2(t) + Q^2(t)} $$    (3.4)

and

$$ C_h(t) = \tan^{-1} \frac{Q(t)}{I(t)}. $$    (3.5)

$C_s(t)$ and $C_h(t)$ correspond, very roughly, to the color
saturation and the hue, respectively, i.e., to the constant
color saturation and constant hue stream lines in the plane of
chrominance. Thus, a typical commercial color TV signal can
be expressed as

$$ Y(t) + M(t) = Y(t) + I(t) \sin \omega_c t + Q(t) \cos \omega_c t. $$    (3.6)
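The equivalence between the quadrature form and the amplitude-phase form of the modulated chrominance signal can be checked numerically. In this sketch `math.atan2` is used in place of a plain arctangent so that the phase lands in the correct quadrant when I(t) is negative; the function names and sample values are illustrative.

```python
import math


def chroma_polar(i, q):
    """Amplitude and phase of the chrominance pair (I, Q)."""
    return math.hypot(i, q), math.atan2(q, i)


def chroma_quadrature(i, q, phase):
    """I sin(wt) + Q cos(wt), with phase = wt."""
    return i * math.sin(phase) + q * math.cos(phase)


def chroma_amplitude_phase(c_s, c_h, phase):
    """C_s sin(wt + C_h), with phase = wt."""
    return c_s * math.sin(phase + c_h)


samples = [(0.3, 0.1), (-0.2, 0.4), (-0.5, -0.25), (0.0, 0.6)]
max_error = 0.0
for i, q in samples:
    c_s, c_h = chroma_polar(i, q)
    for step in range(16):
        wt = 2 * math.pi * step / 16
        error = abs(chroma_quadrature(i, q, wt) - chroma_amplitude_phase(c_s, c_h, wt))
        max_error = max(max_error, error)
print(max_error)
```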

This composite modulated signal, together with the reference
phase of the subcarrier frequency, enables the receiver to
demodulate Y(t), I(t) and Q(t). The corresponding primary
signals, R(t), G(t) and B(t), are obtained by the inverse linear
transformation:

$$ \begin{bmatrix} R(t) \\ G(t) \\ B(t) \end{bmatrix} = M^{-1} \begin{bmatrix} Y(t) \\ I(t) \\ Q(t) \end{bmatrix} $$

The general color TV communication scheme is illustrated in
Figure 3.2.
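As a worked check of the inverse transformation, the sketch below inverts the 3 x 3 NTSC matrix (two-decimal coefficients) by the adjugate formula and verifies that an RGB sample survives a round trip. The helper names are hypothetical, and a real decoder would use fixed, precomputed inverse coefficients rather than inverting at run time.

```python
M = [
    [0.30, 0.59, 0.11],
    [0.60, -0.28, -0.32],
    [0.21, -0.52, 0.31],
]


def inverse3(m):
    """Inverse of a 3 x 3 matrix via its adjugate and determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[v / det for v in row] for row in adjugate]


def apply(m, vector):
    """Matrix-vector product."""
    return [sum(m[r][k] * vector[k] for k in range(3)) for r in range(3)]


M_INV = inverse3(M)
rgb = [0.2, 0.5, 0.9]
yiq = apply(M, rgb)
rgb_back = apply(M_INV, yiq)
print(M_INV)
print(rgb_back)
```

Because the rows for I and Q each sum to zero, the first column of the inverse is all ones: the brightness Y contributes equally to R, G and B.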

3.1 Digital Color Video Data Compression

Equation (3.6) enables color video information to be
packed in a continuous analog signal. Therefore, the
monochrome digital video data compression technique can be applied
if the mean square error of the decoded signal is sufficiently
small so that the phase and amplitude errors of the
reconstructed signal produce tolerable visual errors.
This can be done by increasing the sampling rate of the A-to-D
converter and assigning more information bits to the
quantization of the Hadamard components. This method has been
tried experimentally by Enomoto and Shibata [4] using 3.75
bits per pel.

Figure 3.2. Modulated Composite Color Signal

However, this method is not recommended by this report for
the following reasons:

1) To resolve the phase and amplitude information
of the modulated signal to within tolerable visual
sensations, a substantially higher sampling frequency
must be assigned to the A-to-D converter. The typical
requirement in this case is in the range of 12 MHz to
15 MHz. A-to-D converters of this kind operate at the
limit of the present state of the art. The
corresponding data conversion accuracy, power consumption, weight
and cost make them undesirable for a limited environment,
such as a space vehicle used in the Space Shuttle
program.

2) The corresponding Hadamard transformer and
quantizer must also operate at higher rates. Consequently,
power consumption, weight, etc., will increase.

3) The additional analog modulation-demodulation of
the chrominance signals will contribute additional
errors to the overall system.

A digitized color video data compression algorithm
suggested by this report is given as follows:

The three primary color video signals, R(t), G(t)
and B(t), are first analog transformed into Y(t), I(t), and
Q(t), using a simple resistor network and video inverting
amplifiers. Among these, the monochrome brightness
component, Y(t), conveys most of the picture information
and requires the highest transmission bandwidth. As in Section
2.0, 512 samples per horizontal line (approximately 8 MHz
sampling frequency), and 8 information bits per sample are
sufficient for video reproduction purposes. I(t) and Q(t)
have substantially lower bandwidth requirements (this
results mainly from the poor spatial response of the human
visual system to the chrominance components). Comparatively,
either I(t) or Q(t) has bandwidth requirements of less than
half of that required by Y(t). This enables us to sample I(t)
and Q(t) at half the sampling frequency used for Y(t) while
maintaining sufficient chrominance resolution. The reduction
in horizontal sampling rate is mainly due to the loss of
chrominance response of human eyes to higher spatial
frequencies. This reasoning applies equally well vertically.
Consequently, I(t) and Q(t) may share the same A-to-D
converter by sampling I(t) and Q(t) at alternate horizontal
lines. (Note: this vertical bandwidth reduction cannot be
applied directly in the case of analog modulation-demodulation
described in (3.1) due to the horizontal scanning nature
of the TV communication system.) For ripple-free operation,
the sampling frequency for the A-to-D converter used for
Q(t) and I(t) is synchronized to that used for Y(t) by a
simple frequency divider. In addition, since the chrominance
resolution of human eyes is much less than that of monochrome
brightness, a 6-bit resolution seems to be sufficient for
the I(t) and Q(t) A-to-D converter. This corresponds to
31 or more levels of color purity in the directions: white
to cyan, white to orange red, white to green, and white to
magenta.

Using the above, the additional encoder front-end
analog circuit requirements are: a 6-bit A-to-D converter,
a resistor network, and an analog multiplexer for the I(t)
and Q(t) signals. For limited environments, as in a space
vehicle, this seems to be preferable to compressing
the composite modulated signal, (3.6).

In doing so, the digitized color signal can be
represented by a lattice of 512 x 480 sample points for the
monochrome brightness component, Y(t), and a lattice of 256 x
240 sample points for the chrominance components, I(t) and
Q(t). The monochrome brightness subpictures are chosen with
size of 4 x 4 samples, and the chrominance subpictures with
2 x 2 samples. Larger chrominance subpicture sizes are not
used because the spatial statistical correlation between
sampling points within each subpicture is only a function of
its physical dimensions. The bandwidth reduction obtained
by lowering the sampling rate results merely from the
psychovisual phenomena. Further, the chrominance subpictures
are made to coincide with the monochrome brightness subpictures.
This is illustrated in Figure 3.3. The corresponding
digitized color subpictures are represented by the following
vectors:

Brightness Vector

$$ Y = \begin{bmatrix} x_{11} & \cdots & x_{14} \\ \vdots & & \vdots \\ x_{41} & \cdots & x_{44} \end{bmatrix} $$    (3.7)

Cyan-Orange Red Vector

$$ I = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} $$    (3.8)

Magenta-Green Vector

$$ Q = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} $$    (3.9)

where $x_{ij}$ is the sampled brightness value at the (i,j)
coordinate of the subpicture, and the $u_{ij}$'s and $v_{ij}$'s
are the sampled chrominance values. This is illustrated in
Figure 3.3. They have integer representations in the
following ranges:

$$ 0 \le x_{ij} \le 255 $$

$$ 0 \le u_{ij},\ v_{ij} \le 63 $$

Figure 3.3. Orientations of Subpicture Sample Points

Hadamard transformations can be applied to the
above vectors:

$$ HY = \begin{bmatrix} C^{Y}_{11} & \cdots & C^{Y}_{14} \\ \vdots & & \vdots \\ C^{Y}_{41} & \cdots & C^{Y}_{44} \end{bmatrix}, $$    (3.10)

$$ H'I = \begin{bmatrix} C^{I}_{11} & C^{I}_{12} \\ C^{I}_{21} & C^{I}_{22} \end{bmatrix} \quad \text{and} \quad H'Q = \begin{bmatrix} C^{Q}_{11} & C^{Q}_{12} \\ C^{Q}_{21} & C^{Q}_{22} \end{bmatrix}, $$    (3.11)

where H is the Hadamard transform for 4x4 subpictures, and
H' is that for 2x2 subpictures. The transformed range is
given as follows:

$$ 0 \le C^{Y}_{11} \le 4 \times 255 $$

$$ -2 \times 255 \le C^{Y}_{ij} \le 2 \times 255 \quad \text{for } i \neq 1 \text{ or } j \neq 1 $$

$$ 0 \le C^{I}_{11},\ C^{Q}_{11} \le 2 \times 63 $$

$$ -63 \le C^{I}_{ij},\ C^{Q}_{ij} \le 63 \quad \text{for } i \neq 1 \text{ or } j \neq 1 $$

The monochrome video data compression procedures
described in Section 2.0 can be applied to HY. When
two-dimensionally compressed, HY requires 2 bits per pel, or
equivalently, 491.52 Kbits per frame. Likewise, components
for H'I and H'Q can be compressed using similar logarithmic
quantization procedures. In particular, due to the low
spatial frequency response for the magenta-green chrominance
component, $C^{Q}_{12}$, $C^{Q}_{21}$ and $C^{Q}_{22}$ are
discarded. This follows from the reasoning that using
$C^{Q}_{11}$ alone to approximate the Q(t) component
corresponds to sampling Q(t) at 1/4 the sample
frequency for Y(t), or, equivalently, sampling at
approximately 2 MHz. This is substantially higher than the
Nyquist rate required for the 500 kHz bandwidth of the Q(t)
signal used in standard commercial color TV signals.
The logarithmic DPCM technique should be applied to the
$C^{I}_{11}$ and $C^{Q}_{11}$ components in order to extract
maximal advantage due to the residual statistical correlation
existing between adjacent subpictures.
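A logarithmic quantizer of the kind referred to here is a table lookup on the magnitude of the coefficient. The sketch below uses the 9-level cutpoints and representative values as printed in Table 3.1 (cutpoints 3, 9, 14, 26; representative values 6, 12, 20, 32) and adds a zero level to make up the nine levels. Treating a value equal to a cutpoint as belonging to the higher level is an assumption, and the function name `quantize` is illustrative.

```python
# 9-level table as printed in Table 3.1 (magnitudes; signs are symmetric).
CUTPOINTS = [3, 9, 14, 26]
REPRESENTATIVES = [0, 6, 12, 20, 32]   # zero level assumed for |value| < 3


def quantize(value):
    """Map a coefficient to its representative value.

    The level index is the number of cutpoints the magnitude reaches;
    a value exactly on a cutpoint is assigned to the higher level
    (an assumption, since the table does not state the convention).
    """
    magnitude = abs(value)
    level = sum(1 for cut in CUTPOINTS if magnitude >= cut)
    representative = REPRESENTATIVES[level]
    return -representative if value < 0 else representative


outputs = sorted({quantize(v) for v in range(-63, 64)})
print(outputs)
print([quantize(v) for v in (0, 2, 5, -10, 63)])
```

The step size grows with the magnitude, so small coefficients are represented finely and large ones coarsely, which is what makes the scale logarithmic in character.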

The quantization table for chrominance components 
is given in Table 3.1. The designation for the Hadamard 
components is shown in Figure 3.4. 


I O 

Cii and J^sve 23 quantization levels using 

logarithmic DPCM methods. 

I I 

^12 ^21 quantization levels. 


C 22 9 quantization leve!}.s . 


TABLE 3.1. Quantization Table

$C_{11}^{I}$ and $C_{11}^{Q}$: quantized by 23 levels.

    Cutpoints:              ±96, ±74, ±57, ±44, ±33, ±24, ±17,
                            [illegible], ±7, ±1 [sic], ±1
    Representative values:  ±108, ±84, ±64, ±50, ±38, ±28, ±20,
                            ±14, ±9, ±5, ±2, 0

In addition, the following cutpoint pairs of $C_{11}^{I}$ and $C_{11}^{Q}$
are deleted, since they occur outside the reproducible color triangle:

    (+96,-96), (+74,-96), (+57,-96), (+44,-96), (+33,-96),
    (+96,-74), (+74,-74), (+57,-74), (+44,-74).

This enables $C_{11}^{I}$ and $C_{11}^{Q}$ to share 9 information bits.

$C_{12}^{I}$ and $C_{21}^{I}$: quantized by 15 levels.

    Cutpoints:              ±39, ±29, ±21, ±15, ±10, ±6
    Representative values:  ±44, ±34, ±25, ±18, ±12, ±8, ±4

$C_{22}^{I}$: quantized by 9 levels.

    Cutpoints:              ±26, ±14, ±9, ±3
    Representative values:  ±32, ±20, ±12, ±6

$C_{12}^{I}$, $C_{21}^{I}$, and $C_{22}^{I}$ share 11 information bits.

$C_{11}^{I}$ and $C_{11}^{Q}$ together have 23 x 23 = 529 possible pairs
of representative values. To encode them all would require
more than 9 information bits. However, when both I(t)
and Q(t) have extreme negative values, the resultant
chrominance coordinate lies well outside the reproducible
color triangle formed with the three primary color vertices.
Hence, by combining the representative values and deleting
the cutpoint pairs outside the reproducible color
triangle, as shown in Quantization Table 3.1, $C_{11}^{I}$ and
$C_{11}^{Q}$ are effectively encoded by 9 information bits.
$C_{12}^{I}$, $C_{21}^{I}$, and $C_{22}^{I}$ share 11 information bits. The
overall bit requirement for the chrominance components
is 20 bits per subpicture. Thus, the resultant two-dimensionally
compressed color data requires 3.25 bits per picture
element (based upon 512 x 480 picture elements per
frame), or equivalently, 798.72 Kbits per frame. Therefore,
by transmitting the color pictures using only two-dimensionally
compressed data, the overall bit rate requirement
is 23.9616 Megabits per second.
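
The bit-budget arithmetic above can be checked with a short sketch. It is only an illustration: the function and variable names are hypothetical, the frame rate of 30 frames per second is an assumption (it is the value that makes 798.72 Kbits per frame equal 23.9616 Megabits per second), and the luminance share is inferred as whatever remains of the 3.25 bits per picture element after the 20 chrominance bits.

```python
# Chrominance and overall bit budget per 4 x 4 subpicture (a sketch).
PELS_PER_SUBPICTURE = 4 * 4
CHROMA_BITS = 20          # stated total for the chrominance components
CHROMA_DC_BITS = 9        # shared by the two d.c. chrominance components
BITS_PER_PEL = 3.25       # stated overall figure
FRAME_RATE_HZ = 30        # assumption: nominal 30 frames per second


def color_bit_budget(width=512, height=480):
    """Return the bit-budget figures implied by the numbers quoted above."""
    bits_per_subpicture = BITS_PER_PEL * PELS_PER_SUBPICTURE
    bits_per_frame = BITS_PER_PEL * width * height
    return {
        "chroma_non_dc_bits": CHROMA_BITS - CHROMA_DC_BITS,
        "bits_per_subpicture": bits_per_subpicture,
        "luminance_bits": bits_per_subpicture - CHROMA_BITS,
        "kbits_per_frame": bits_per_frame / 1000,
        "megabits_per_second": bits_per_frame * FRAME_RATE_HZ / 1e6,
    }


budget = color_bit_budget()
# The 15 x 15 x 9 level combinations must fit in the non-d.c. chrominance bits.
fits_non_dc = 15 * 15 * 9 <= 2 ** budget["chroma_non_dc_bits"]
# All 23 x 23 d.c. pairs do not fit in 9 bits unless some pairs are deleted.
needs_deletion = 23 * 23 > 2 ** CHROMA_DC_BITS
```

The two boolean checks mirror the argument in the text: the non-d.c. chrominance levels fit in the remaining bits, while the d.c. pairs fit in 9 bits only after some pairs are removed.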

Further bit rate reduction is possible by applying a 
motion detection method similar to that described in Section 
2.2. Here, the object movement or object mutation can be 
defined as either a change of brightness intensity, or a 
shift of chrominance coordinates within each defined motion 


region. This can be done, as in Section 2.2, by comparing
$C_{11}^{Y}$, $C_{11}^{I}$, and $C_{11}^{Q}$ with their predecessors. When
motion is detected, the information within the motion region is
updated by transmitting the quantized differences of the d.c.
components. Position markers described in Section 2.2
can be applied here directly.

Although a motion detection technique is ideal for
minimal data rate transmission, especially when the object
scene is relatively inactive, it is not recommended by this
report because of the limited environment of the space vehicle
in which the color video source encoder is located. The motion
detection method requires a frame memory for the d.c.
components of approximately 276 Kbits and a rate buffer of
approximately 870 Kbits, whereas the two-dimensional data
compression technique given above requires neither. Since the
resultant bit rate is only on the order of 24 Megabits per
second, which easily satisfies the downlink constraint of
50 Megabits per second, the two-dimensional technique is
recommended from a hardware implementation point of view.

The functional block diagrams for the color video
encoder and decoder are illustrated in Figures 3.5 and 3.6.

To minimize possible accumulated decoding errors due to
the use of the DPCM technique, special line group and field
synchronizing codewords are used to confine this accumulative
effect to within each individual line group.

The decoder outputs, Y(t), I(t), and Q(t), are filtered
by low pass networks to insure smooth analog transitions.
Typically, the cutoff frequencies for Q(t) and I(t) are 1/4
and 1/2, respectively, of that for Y(t). This normally
results in undesirable chrominance location errors due to
the different propagation delays of the filters. As in normal
color TV reproduction, linear delay lines are used in series
with Y(t) and I(t) to optimize the chromatic effect.

Although experiments for the proposed color source
coding technique have not been performed, due to the lack
of color video equipment, we are confident that reasonable
color reproduction quality is achievable. This can be
substantiated by the following response analysis of the
proposed color source coding system.

Since the performance of the component $\hat{Y}(t)$ is
identical to the black-and-white monochrome video data
compression, its feasibility is verified experimentally. Hence,
it is sufficient to analyze the response of $\hat{I}(t)$ and $\hat{Q}(t)$.
From the quantizations selected for $C_{11}^{I}$ and $C_{11}^{Q}$ it is easy
to verify the following (in response to a step function of
amplitude A):

1) The steady state error for each chrominance component
is less than [fraction illegible] $E_m$, where $E_m$ is the maximum
chrominance amplitude (which normally determines the
dynamic range of the A-to-D converter).

2) For a step function amplitude A [bound illegible, stated
relative to $E_m$], the decoder can approximate to within 20%
of A in one subpicture time, i.e., approximately 500 ns.

3) The steady state (i.e., to within ± [value illegible]) can
be reached within 3 subpicture times.

Hence, the equivalent horizontal bandwidth for
$\hat{Q}(t)$ is about the 500 KHz that is normally allocated
for commercial color TV broadcasting. The vertical
resolution is slightly worse due to the 2-to-1 field interlacing.

The Quantization Table for $C_{12}^{I}$, $C_{21}^{I}$, and $C_{22}^{I}$ is
selected to simulate an equivalent horizontal bandwidth of
over 1 MHz for $\hat{I}(t)$, with the assumption that the chrominance
variation between adjacent picture elements, in the
cyan-orange red direction, is relatively small.

4.0 Data Error Rate Analysis

The principal advantage of a digital data communication
system is its adaptability to highly efficient discrete
channel encoder-decoders, where high output
signal-to-noise ratios can be obtained with relatively low
transmitted energy. This section presents a data error rate
analysis pertinent to the video source encoder-decoder
pairs described in the previous sections. The approach
will be to estimate the expected number of picture element
errors per frame for the uplink and downlink source and
channel coding systems as a function of the energy per
picture element-to-single sided noise power density ratio,
$E_{pel}/N_0$. The performances of these systems are compared
with that of an uncoded (i.e., no source or channel coding)
digital video system, and it is shown that significant
$E_{pel}/N_0$ gains can be achieved by the systems with coding. It
should be noted that to obtain good quality video such an
uncoded system would require a channel bit rate in excess
of that which is presently available.

First we shall develop expressions for the expected
number of picture element errors per frame in terms of
various error probabilities. Then we shall relate these
error probabilities to $E_{pel}/N_0$ and present performance
curves for the uplink and downlink source and channel coding
systems.

From the point of view of analysis, an exact mathematical
model is difficult to prescribe and to analyze
for these video source encoder-decoder pairs. Thus, in
the sequel a set of mathematical assumptions is made to
facilitate the analysis. These are:

1) We shall distinguish three types of source data:
the d.c. components, non-d.c. components, and synchronizing
codewords. The d.c. components are transmitted differentially
either in the two-dimensionally compressed mode or
the updating mode. Because the accuracy of
reconstructed data is completely dependent on the accuracies
of previous reconstructed data, this differential method has
the inherent disadvantage that data errors can propagate
either horizontally, in the two-dimensionally compressed
mode, or sequentially (i.e., from frame to frame) in the
updating mode. In contrast, the effects of erroneous
deciphering of the non-d.c. components are not propagated. In
the analysis we shall assume the propagation effect is
permanent, i.e., in the two-dimensionally compressed mode, a
d.c. component error causes the subsequent portion of the
line group to have only erroneous d.c. components, and
likewise, in the updating mode, it causes the subsequent
corresponding subpictures to have only erroneous d.c.
components (in other words, we do not allow two or more errors
to make one correct deciphering). This is illustrated in
Figure 4.1.

[Figure 4.1: a typical line group, starting at the 1st subpicture.
X marks erroneously decoded d.c. components; shaded areas are
considered as error.]

FIGURE 4.1 ERROR PROPAGATION EFFECT OF THE D.C. COMPONENTS OF
A TYPICAL LINE GROUP



2) Similarly, if an error occurs in deciphering
the synchronizing codeword then the entire line group is
considered to be erroneous.

3) The video source encoding alphabets are
"ordered" in the sense that some decoding errors have
less effect than others. (For example, if the transmitted
d.c. component has code representation -1, then
the effect of deciphering it as 0 is not as bad as that of
deciphering it as +15. Generally speaking, distance properties
of the channel code could be utilized so that a larger
amplitude of decoding error corresponds to a lower probability
of occurrence.) However, for mathematical simplicity, we
shall assume that the deciphered components can be either
"correct" or "incorrect". This may be interpreted as the
probability of deciphering a component such that the error
amplitude is beyond tolerance. Furthermore, we shall assume
the events of deciphering the source alphabets are independent.
With this, let us denote:

    $p$ = probability of error in decoding a d.c. component,
    $\alpha$ = probability of error in decoding the non-d.c.
        components of a subpicture, and
    $\gamma$ = probability of error in decoding a synchronizing
        codeword.

$p$, $\alpha$, and $\gamma$ are very small positive real numbers.

4) For the motion detection algorithm, we shall
assume that reference frames are transmitted once every
L+1 frames (L is a positive integer).

With the above assumptions, we shall divide the
error analysis into two parts, one for the two-dimensionally
compressed data, and the other for the motion detection
algorithm. Our objective is to obtain general mathematical
expressions for the expected number of subpicture
errors per frame in terms of $p$, $\alpha$, $\gamma$, L and n (where n =
number of subpictures per line group = 128). Then the
performance of each individual video source encoder-decoder
pair can be determined by substituting the appropriate values
for $p$, $\alpha$, $\gamma$, and L.

4.1 General Data Error Expression for the Two-Dimensionally
Compressed Video Algorithm

We shall calculate an expression for the expected
number of subpicture errors per frame, denoted by the symbol
ERF. First, we shall calculate the expected number of
subpicture errors per line group, denoted by the symbol ERL.
Then, ERF is related to ERL by:

$$ERF = 120 \cdot ERL \ \text{ subpictures per frame} \tag{4.1}$$

Due to the error propagation effect of the d.c. components,
it is necessary to calculate the probability that
exactly k d.c. components of a line group are correctly
decoded, given the synchronizing codeword for the line
group is correctly decoded; we shall denote this by $P(k)$.
From the assumptions given above, we have

$$P(k) = \begin{cases} p\,q^{k} & \text{for } 0 \le k \le n-1 \\ q^{n} & \text{for } k = n \end{cases} \tag{4.2}$$

where $q = 1 - p$. (It is not difficult to verify that
$\sum_{k=0}^{n} P(k) = 1$.)

Then a line group has exactly k correctly decoded
subpictures (given the synchronizing codeword is correctly
decoded) if and only if the line group has k+i correctly
decoded d.c. components and exactly k correctly decoded
non-d.c. components (among the subpictures with correct
d.c. components), for some i = 0, 1, 2, ..., n-k.

Therefore,

    Q(k) = probability that exactly k subpictures of
           a line group are correctly decoded given
           the synchronizing codeword is correct,

$$Q(k) = \sum_{i=0}^{n-k} \binom{k+i}{i}\, \beta^{k} \alpha^{i}\, P(k+i) \tag{4.3}$$

where $\beta = 1 - \alpha$.

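We have

$$E\left[\ \text{number of incorrect subpictures per line group} \ \middle|\ \text{the synchronizing codeword is correct}\ \right] = \sum_{k=0}^{n} (n-k)\, Q(k) \tag{4.4}$$

Using the following summation equality:

$$\sum_{k=0}^{n} \sum_{j=0}^{n-k} t(k+j,\, j) = \sum_{k=0}^{n} \sum_{j=0}^{k} t(k,\, j)$$

for arbitrary integer function $t$, we have:

$$\sum_{k=0}^{n} Q(k) = \sum_{k=0}^{n} P(k) = 1$$

and

$$\begin{aligned}
\sum_{k=0}^{n} k\, Q(k) &= \sum_{k=0}^{n} \sum_{i=0}^{n-k} k \binom{k+i}{i} \beta^{k} \alpha^{i} P(k+i) \\
&= \beta\, \frac{d}{dx} \left[ \sum_{k=0}^{n} \sum_{i=0}^{n-k} \binom{k+i}{i} x^{k} \alpha^{i} P(k+i) \right]_{x=\beta} \\
&= \beta\, \frac{d}{dx} \left[ \sum_{k=0}^{n} P(k)\, (x+\alpha)^{k} \right]_{x=\beta} \\
&= \beta \sum_{k=0}^{n} k\, P(k) \\
&= \beta \left[\, p \sum_{k=0}^{n-1} k\, q^{k} + n\, q^{n} \right] \\
&= \beta\, q\, (1-q^{n})/p
\end{aligned}$$

Hence, (4.4) is equal to

$$n - \beta\, q\, (1-q^{n})/p .$$

Therefore,

$$\begin{aligned}
ERL &= (1-\gamma) \left\{ n - \beta\, q\, (1-q^{n})/p \right\} + n\gamma \\
&= n - \beta\, (1-\gamma)\, q\, (1-q^{n})/p ,
\end{aligned}$$

and

$$ERF = 120 \left\{ n - \frac{\beta\, (1-\gamma)\, q\, (1-q^{n})}{p} \right\} \ \text{ subpictures per frame} \tag{4.5}$$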
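
As a numerical cross-check of the closed form, the sketch below evaluates ERL both directly from the definitions of P(k) and Q(k) and from the closed-form expression. It is only an illustration under the assumptions of this section; the function names are hypothetical and not part of the original report.

```python
from math import comb


def p_dc(k, n, p):
    """P(k): probability that exactly k d.c. components are correct."""
    q = 1.0 - p
    return q ** n if k == n else p * q ** k


def erl_direct(n, p, alpha, gamma):
    """ERL from the sums defining Q(k) and the conditional expectation."""
    beta = 1.0 - alpha
    expected_bad = 0.0
    for k in range(n + 1):
        q_k = sum(comb(k + i, i) * beta ** k * alpha ** i * p_dc(k + i, n, p)
                  for i in range(n - k + 1))
        expected_bad += (n - k) * q_k
    # A sync codeword error (probability gamma) spoils all n subpictures.
    return (1.0 - gamma) * expected_bad + n * gamma


def erl_closed(n, p, alpha, gamma):
    """ERL from the closed form n - beta (1 - gamma) q (1 - q^n) / p."""
    q = 1.0 - p
    beta = 1.0 - alpha
    return n - beta * (1.0 - gamma) * q * (1.0 - q ** n) / p


def erf_closed(n, p, alpha, gamma, line_groups=120):
    """ERF: expected subpicture errors per frame of 120 line groups."""
    return line_groups * erl_closed(n, p, alpha, gamma)
```

For example, `erl_direct(16, 0.05, 0.02, 0.01)` and `erl_closed(16, 0.05, 0.02, 0.01)` should agree to within floating-point rounding. For the very small probabilities of practical interest the closed form subtracts nearly equal numbers, so a careful implementation would guard against loss of precision.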

4.2 General Data Error Expression for the Motion Detection
Algorithm

Likewise, we shall obtain an expression for the expected
number of subpicture errors per frame. Since there
are L updating frames for each reference frame, it is clear
that the expected error gets progressively worse towards
the L-th updating frame. Let us denote:

    ERF($\ell$) = the expected number of subpicture errors
             per frame for the $\ell$-th updating frame,

    ERL($\ell$) = the expected number of subpicture errors
             per line group for the $\ell$-th updating frame,

for $\ell$ = 0, 1, 2, ..., L,
with $\ell$ = 0 corresponding to the reference frame.

Here the error propagation effect of the d.c. components
not only occurs horizontally in the reference frame,
but also occurs sequentially in the updating mode. First,
we have to calculate the probability that exactly k d.c.
components of a line group, at the $\ell$-th frame, are correctly
decoded, given that the synchronizing codewords up to the
$\ell$-th frame are correctly decoded. We shall denote this by:

$$P_{\ell}(k), \qquad k = 0, 1, 2, \ldots, n; \quad \ell = 0, 1, 2, \ldots, L$$

Clearly we have

$$P_{0}(k) = \begin{cases} p\,q^{k} & 0 \le k \le n-1 \\ q^{n} & k = n \end{cases} \tag{4.6}$$

(i.e., (4.2)).

Since a line group of the $\ell$-th frame has k correctly decoded
d.c. components (given the synchronizing codewords are
correct) if and only if its predecessor (i.e., at the ($\ell$-1)st
frame) has k+i correctly decoded d.c. components, followed
by exactly k correctly decoded d.c. components (among the
k+i correctly decoded ones), for i = 0, 1, 2, ..., n-k, we
have

$$P_{\ell}(k) = \sum_{i=0}^{n-k} \binom{k+i}{i}\, q^{k} p^{i}\, P_{\ell-1}(k+i) \tag{4.7}$$


The probability that exactly k subpictures of a line group,
at the $\ell$-th frame, are correctly decoded, given the
synchronizing codewords are correct (as in (4.3)), is given
by

$$Q_{\ell}(k) = \sum_{i=0}^{n-k} \binom{k+i}{i}\, \beta^{k} \alpha^{i}\, P_{\ell}(k+i) \tag{4.8}$$

Therefore, the expected number of subpicture errors per line
group, given the synchronizing codewords are correct, is

$$A(\ell) = \sum_{k=0}^{n} (n-k)\, Q_{\ell}(k) \tag{4.9}$$


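To obtain a closed form for (4.9), first we have

$$A(\ell) = \sum_{k=0}^{n} \sum_{i=0}^{n-k} (n-k) \binom{k+i}{i} \beta^{k} \alpha^{i}\, P_{\ell}(k+i)$$

Observe that

$$\begin{aligned}
\sum_{k=0}^{n} \sum_{i=0}^{n-k} \binom{k+i}{i} \beta^{k} \alpha^{i} P_{\ell}(k+i)
&= \sum_{k=0}^{n} P_{\ell}(k) \sum_{i=0}^{k} \binom{k}{i} \beta^{k-i} \alpha^{i}
 = \sum_{k=0}^{n} P_{\ell}(k) \\
&= \sum_{k=0}^{n} \sum_{i=0}^{n-k} \binom{k+i}{i} q^{k} p^{i} P_{\ell-1}(k+i)
 = \sum_{k=0}^{n} P_{\ell-1}(k)\, (p+q)^{k} \\
&= \sum_{k=0}^{n} P_{\ell-1}(k) = \cdots = \sum_{k=0}^{n} P_{0}(k) = 1
\end{aligned}$$

and

$$\begin{aligned}
\sum_{k=0}^{n} \sum_{i=0}^{n-k} k \binom{k+i}{i} \beta^{k} \alpha^{i} P_{\ell}(k+i)
&= \beta\, \frac{d}{dx} \left[ \sum_{k=0}^{n} \sum_{i=0}^{n-k} \binom{k+i}{i} x^{k} \alpha^{i} P_{\ell}(k+i) \right]_{x=\beta} \\
&= \beta \sum_{k=0}^{n} k\, P_{\ell}(k) .
\end{aligned}$$

Since

$$\begin{aligned}
\sum_{k=0}^{n} k\, P_{\ell}(k)
&= \sum_{k=0}^{n} \sum_{i=0}^{n-k} k \binom{k+i}{i} q^{k} p^{i} P_{\ell-1}(k+i) \\
&= q\, \frac{d}{dx} \left[ \sum_{k=0}^{n} \sum_{i=0}^{n-k} \binom{k+i}{i} x^{k} p^{i} P_{\ell-1}(k+i) \right]_{x=q} \\
&= q \sum_{k=0}^{n} k\, P_{\ell-1}(k) = \cdots = q^{\ell} \sum_{k=0}^{n} k\, P_{0}(k) \\
&= q^{\ell+1} (1-q^{n})/p ,
\end{aligned}$$

therefore, we have

$$A(\ell) = n - \frac{\beta\, q^{\ell+1} (1-q^{n})}{p} \tag{4.11}$$

Consequently,

$$\begin{aligned}
ERL(\ell) &= (1-\gamma)^{\ell+1} A(\ell) + n \left[ 1 - (1-\gamma)^{\ell+1} \right] \\
&= n - \frac{\beta\, [q(1-\gamma)]^{\ell+1} (1-q^{n})}{p}
\end{aligned} \tag{4.12}$$

and

$$ERF(\ell) = 120 \left\{ n - \frac{\beta\, [q(1-\gamma)]^{\ell+1} (1-q^{n})}{p} \right\} \tag{4.13}$$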
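
The recursion for the updating frames can be checked numerically in the same way. The sketch below iterates the distribution of correctly decoded d.c. components frame by frame, forms ERL for the given frame from the definitions, and compares it with the closed form. The function names are hypothetical, and a single d.c. error probability p is assumed for reference and updating frames alike.

```python
from math import comb


def dc_distribution(n, p, frame):
    """Distribution of correct d.c. components at the given frame index."""
    q = 1.0 - p
    dist = [p * q ** k for k in range(n)] + [q ** n]   # reference frame
    for _ in range(frame):                              # one pass per update
        dist = [sum(comb(k + i, i) * q ** k * p ** i * dist[k + i]
                    for i in range(n - k + 1))
                for k in range(n + 1)]
    return dist


def erl_updating_direct(n, p, alpha, gamma, frame):
    """ERL for the given frame, summed directly from the definitions."""
    beta = 1.0 - alpha
    dist = dc_distribution(n, p, frame)
    a = sum((n - k) * sum(comb(k + i, i) * beta ** k * alpha ** i * dist[k + i]
                          for i in range(n - k + 1))
            for k in range(n + 1))
    sync_ok = (1.0 - gamma) ** (frame + 1)   # all sync codewords so far correct
    return sync_ok * a + n * (1.0 - sync_ok)


def erl_updating_closed(n, p, alpha, gamma, frame):
    """Closed form n - beta [q (1 - gamma)]^(frame + 1) (1 - q^n) / p."""
    q = 1.0 - p
    beta = 1.0 - alpha
    return n - beta * (q * (1.0 - gamma)) ** (frame + 1) * (1.0 - q ** n) / p
```

With frame = 0 both functions reduce to the two-dimensional case, and the expected error grows with the frame index, as the text states.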


It should be noted that in the above analyses we have
made the assumption that an erroneous deciphering of the
line group synchronizing codeword results in only one line
group error. In reality, the situation is more complicated.
Generally, such an error will cause the line group supposed
to be updated to remain unchanged, and, in addition, it also
causes an erroneous updating on some other line group, which
may or may not be an incorrectly decoded line group.

In addition, in the analyses, we have assumed that
the probability of decoding error for the d.c. components
of the reference frames is the same as that for the d.c.
components of the updating frames. Actually, these
probabilities of decoding error are different because, in
the updating mode, the location marker must be correctly
decoded in order to decode correctly the components of
a subpicture, whereas this restriction does not exist in the
reference mode. In general, if we let

    $P_m$ = probability of motion detected for a subpicture,
    $P_L$ = error probability of decoding the location marker,

then we have

    $p''$ = probability of error in decoding the d.c.
            components of the updating frames
          $= P_m\, [\, p + P_L\, ]$.

Using these values, it is not difficult to verify that
(4.13) becomes:

$$ERF(\ell) = 120 \left\{ n - \frac{\beta\, [q''(1-\gamma)]^{\ell}\, q\, (1-\gamma)\, (1-q^{n})}{p} \right\} \tag{4.14}$$

where $q = 1 - p$
and $q'' = 1 - p''$.

4.3 Uplink Baseline System Performance

Normally the performance of a communications system
with a channel which can be modeled by additive white
gaussian noise is specified in terms of the channel coding
information bit energy-to-noise ratio, $E_b/N_0$, required for
a certain error rate. However, when comparing systems
with source coding and channel coding, the performance
should be specified in terms of the received
energy-to-noise ratio. Here we will estimate the noisy channel
performance of video source coding and channel coding
systems by determining the picture element
energy-to-noise ratio, $E_{pel}/N_0$, they require to achieve a certain
expected number of picture element errors per frame. Of
course, the performance of any system with video compression
also depends on the quality of the reconstructed video.

The symbol error probabilities $p$, $p''$, $\alpha$, and $\gamma$ of
the previous sections can be upper bounded by the product
of the channel coding information bit error probability,
$P_b$, times the number of bits in the symbol. The resulting
error probabilities are given in Table 4.1.

When channel coding is used the errors out of the
decoder are clustered. However, for small bit error rates
and the baseline channel coding systems, the bounds of
Table 4.1 represent a very small increase over the
$E_b/N_0$'s obtained using the burst error statistics of the
channel decoder.


    Error            Error Probability Upper Bounds
    Probability      Uplink            Downlink
    -----------      ------            --------
    $p$              5 $P_b$           14 $P_b$
    $p''$            12 $P_b$          -
    $\alpha$         27 $P_b$          38 $P_b$
    $\gamma$         [illegible]       14 $P_b$

Table 4.1 Error Probability Bounds

Thus by using these bounds in (4.13) and assuming
that when a subpicture error occurs, all 16 picture
elements in the subpicture are in error, we can obtain an
expression for the expected number of picture element
errors per frame as a function of the channel coding bit
error rate, $P_b$. For a particular channel coding system,
this error rate, and thus the expected number of picture
element errors per frame, can be related to system
energy-to-noise ratios.

The baseline uplink system is assumed to consist of
the following.

Source Coding

    . Motion detection algorithm
    . 120 x 128 4x4 subpictures/frame
    . 1/8 bit/black-and-white picture element

Channel Coding

    . Constraint length 7
    . Rate 1/3
    . Information bit rate up to 1 Mbps

Modulation

    . BPSK or equivalent

The bit error rate performance of this channel coding
system has been determined by computer simulation for error
rates greater than $10^{-5}$ and by Viterbi's transfer function
bounding technique [9] for error rates of less than $10^{-5}$.
Since this bounding technique does not take into account
the receiver quantization loss, 0.2 dB has been added to
the results obtained with the bounding technique to account
for the loss in using 3-bit receiver quantization. The
high and low error rate results are given in Figures
4.2 and 4.3, respectively.

These channel coding results are for a code with
subgenerators

    1111001
    1100101
    1011011        ... (4.14)

This choice of generators achieves a free distance of 14
rather than the maximum possible free distance of 15.
However, this code has a smaller bit error rate than the
codes with a free distance of 15 for bit error rates of
less than about $10^{-11}$.
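
The following sketch shows how a constraint length 7, rate 1/3 convolutional encoder with the three subgenerators listed above could be written. It is an illustration only: the function name is hypothetical, and the tap ordering (leftmost generator digit applied to the newest input bit) is an assumption, since the report does not state its convention here.

```python
GENERATORS = ("1111001", "1100101", "1011011")   # subgenerators from the text


def encode(bits, generators=GENERATORS):
    """Rate 1/3 convolutional encoding, flushed with K-1 zero bits."""
    k = len(generators[0])                        # constraint length, 7
    taps = [[int(c) for c in g] for g in generators]
    register = [0] * k                            # register[0] is the newest bit
    out = []
    for bit in list(bits) + [0] * (k - 1):
        register = [bit] + register[:-1]          # shift the new bit in
        for t in taps:
            # Each output symbol is the modulo-2 sum of the tapped stages.
            out.append(sum(x & y for x, y in zip(t, register)) % 2)
    return out


impulse_weight = sum(encode([1]))
```

A single 1 followed by zeros produces a codeword whose weight is the total number of ones in the three generators, 5 + 4 + 5 = 14, regardless of the tap ordering. This shows the free distance cannot exceed 14, which is consistent with the value quoted above; the sketch does not prove that no lighter path exists.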

Figure 4.4 gives uplink performance curves obtained
using the channel coding results of Figures 4.2 and 4.3,
and the relationship

$$\frac{E_{pel}}{N_0} = \frac{1}{8}\, \frac{E_b}{N_0} \tag{4.15}$$


As a basis for determining the $E_{pel}/N_0$ gain achieved
with this source and channel coding scheme, let us determine
the expected number of picture element errors versus $E_{pel}/N_0$
performance of a system with no source or channel coding.
Assume 7 bits per picture element. Then this system would
require a data rate of 51.6 Mbps, which considerably exceeds
the 3 Mbps channel bit rate available without channel coding.
Thus some type of source coding would be necessary.

Figure 4.4 Noisy channel performance of the uplink baseline
source and channel coding system.

With the same assumptions as for the baseline system,
the expected number of picture element errors per frame
with this uncoded system is

$$N_{uncoded} = 16\,(128)\,(120)\,(7 P_b) = 1720320\, P_b \tag{4.16}$$

where

$$P_b = Q\!\left( \sqrt{\frac{2 E_s}{N_0}} \right) \tag{4.17}$$

and

$$Q(y) = \frac{1}{\sqrt{2\pi}} \int_{y}^{\infty} \exp\!\left( -\frac{x^{2}}{2} \right) dx \tag{4.18}$$

The picture element energy-to-noise ratio is related to
the channel symbol energy-to-noise ratio by

$$\frac{E_{pel}}{N_0} = 7\, \frac{E_s}{N_0} \tag{4.19}$$

Figure 4.5 gives the performance of this uncoded system
obtained from equations 4.16 through 4.19. In terms of
this performance measure, the uncoded system is seen
to be significantly inferior to the coded system.
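
A curve of the kind shown for the uncoded system can be generated from equations 4.16 through 4.19 with a few lines of code. The sketch below is an illustration only, with hypothetical function names; the Gaussian tail integral is evaluated through the complementary error function of the standard library.

```python
from math import erfc, sqrt

PELS_PER_FRAME = 16 * 128 * 120      # 16 picture elements per subpicture


def q_function(y):
    """Gaussian tail integral Q(y), written in terms of erfc."""
    return 0.5 * erfc(y / sqrt(2.0))


def uncoded_pel_errors(epel_over_n0_db, bits_per_pel=7):
    """Expected picture element errors per frame for the uncoded system."""
    epel_over_n0 = 10.0 ** (epel_over_n0_db / 10.0)
    es_over_n0 = epel_over_n0 / bits_per_pel          # equation (4.19)
    p_b = q_function(sqrt(2.0 * es_over_n0))          # equation (4.17)
    return PELS_PER_FRAME * bits_per_pel * p_b        # equation (4.16)
```

Calling `uncoded_pel_errors` over a range of dB values traces out the expected number of picture element errors per frame against the picture element energy-to-noise ratio.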

Figure 4.5 Noisy channel performance of an uncoded
uplink digital video communications
system with 7 bits/picture element.

4.4 Downlink Baseline System Performance

The noisy channel performance of the downlink source
and channel coding system can be estimated using the same
procedure as for the uplink. In this case the baseline
system is assumed to consist of the following.

Source Coding

    . Two-dimensional algorithm (i.e., $\ell$ = 0 only)
    . 120 x 128 4x4 subpictures/frame
    . 3.25 bits/color picture element

Channel Coding

    . Constraint length 7
    . Rate 1/2
    . Information bit rate up to 50 Mbps. This
      would require parallel decoders for real
      time operation.

Modulation

    . BPSK or equivalent

The choice of 3.25 bits per picture element is
believed to be sufficient for the source coding algorithm
to produce good quality color video. However, this choice
only produces a channel encoder information bit rate of
24 Mbps. Thus a 50% reduction in the channel symbol rate
with a doubling of the channel symbol energy is possible.
Another possibility is to concatenate an outer channel
coding operation to the baseline channel coding scheme.
This concatenated coding possibility is discussed in the
next section.

Figures 4.6 and 4.7 show the bit error rate performance
of the downlink baseline channel coding system
and Figure 4.8 gives the noisy channel performance of
the downlink source and channel coding system.

Figure 4.9 gives the expected number of picture
element errors per frame performance of an uncoded digital
color communications system with 13 bits per picture
element. As with the uplink case, the uncoded system exceeds
the maximum baseline channel symbol rate. Its performance
is included for comparison purposes only.

4.5 Concatenated Channel Coding Performance Improvements

The small bit error probabilities reflected in the
performance curves of the baseline coding systems indicate
that concatenated channel coding may be desirable. The
purpose of concatenated coding is either to obtain a small
error rate with an overall encoder/decoder implementation
complexity which is less than that which would be required
by a single coding operation or to improve the error rate
performance of an existing channel coding system. Typically
the inner code corrects most of the channel errors and
then a rather simple, high rate, outer code reduces the error
rate to the desired value. Also, interleaving is usually
required between the coding operations to break up the error
bursts out of the inner coding operation.

Figure 4.6 Bit error probability performance of a
K=7, R=1/2 convolutional coding system
obtained by simulation. (Axes: bit error probability
versus $E_b/N_0$ in dB.)

Figure 4.7 Small bit error probability performance of
a K=7, R=1/2 convolutional coding system
obtained from bounds. (Vertical axis: bit error probability.)

Figure 4.8 Noisy channel performance of the downlink
baseline source and channel coding system. (Vertical axis:
expected number of picture element errors per frame.)

Figure 4.9 Noisy channel performance of an uncoded
downlink digital video communications
system with 13 bits/color picture element.

4.5.1 Concatenated Channel Coding for All the Data 

Concatenated coding can be used for all the data
or just the data with more severe reliability requirements.
First we consider the case where concatenated coding is
used for all the data. The LINKABIT LF89 feedback-decoded
convolutional encoder/decoder is a good choice for the outer
coding in such a system. This encoder/decoder has a
constraint length 8, rate 3/4 convolutional encoder and a
simple feedback decoder which could operate on the hard
decision outputs of the inner Viterbi-decoded convolutional
coding system. The LF89 is also available with internal
interleaving/deinterleaving.

Figure 4.10 compares the bit error rate performance
of the baseline uplink coding system with that of a
concatenated coding system using the LF89 coding as an outer
code and the baseline K=7, R=1/3 convolutional coding as
the inner code. In the error probability range of Figure
4.10, the performance of the LF89 coding can be approximated
by

$$P_{bit} \approx 2000\, P_{in}^{\,[\text{exponent illegible}]} \tag{4.20}$$

where $P_{in}$ is the bit error rate into the LF89 decoder (i.e.,
out of the Viterbi decoder). The $E_b/N_0$ is also adjusted
to account for the information bit energy change caused
by the lower rate of the concatenated coding system.

Figure 4.10 Uplink concatenated coding bit error rate
performance

Figure 4.10 shows that this concatenated coding
system can provide significant energy-to-noise ratio gains
when small bit error rates are required. For reference,
a $10^{-10}$ bit error rate produces an expected number of
picture element errors per frame of about .009 for the
uplink system with $\ell$ = 0.
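
The reference figure quoted above can be reproduced approximately from the closed-form expression of Section 4.1 and the symbol-length bounds of Table 4.1. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the 5-bit and 27-bit symbol lengths are read from a partly illegible table, and the 14-bit synchronizing codeword length is assumed to apply to the uplink. The logarithmic helper functions are used because the probabilities are too small for the naive expression to retain precision.

```python
from math import expm1, log1p


def uplink_pel_errors(p_b, dc_bits=5, non_dc_bits=27, sync_bits=14,
                      n=128, line_groups=120, pels_per_subpicture=16):
    """Expected picture element errors per frame, reference frame only."""
    p = dc_bits * p_b               # bound on d.c. component error
    alpha = non_dc_bits * p_b       # bound on non-d.c. component error
    gamma = sync_bits * p_b         # bound on sync codeword error (assumed)
    q = 1.0 - p
    beta = 1.0 - alpha
    one_minus_qn = -expm1(n * log1p(-p))    # 1 - q**n without cancellation
    erl = n - beta * (1.0 - gamma) * q * one_minus_qn / p
    return pels_per_subpicture * line_groups * erl


errors_at_1e_10 = uplink_pel_errors(1e-10)
```

With these assumptions the result at a bit error rate of $10^{-10}$ falls close to the .009 quoted in the text. Setting `sync_bits=0` changes the result only slightly, so the check does not depend strongly on the assumed codeword length.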

To accommodate this concatenated coding scheme on
the uplink, it would be necessary to reduce the number of
bits per picture element produced by the source encoder
by the factor .75 to maintain the same approximate 1 Mbps
data rate into the inner encoder. This will degrade the
picture quality, especially when motion occurs. However,
if the picture quality is still acceptable the additional
source compression will provide an $E_{pel}/N_0$ gain of 1.25 dB.
Thus the total $E_{pel}/N_0$ gain is the dB gain shown in Figure 4.10
plus 1.25 dB. At an expected number of picture element
errors per frame of 0.01, this total $E_{pel}/N_0$ gain is 2.65
dB.

Concatenated coding can also be used on the downlink.
Performance improvements similar to those of Figure 4.10
can be obtained with the downlink K=7, R=1/2 baseline
system as the inner code and the LF89 (or several in parallel
to achieve the high data rates) as the outer code. On the
downlink the source coding can remain unchanged since the
inner code can operate up to 50 Mbps. Thus the $E_{pel}/N_0$
gain is equal to the channel coding $E_b/N_0$ gain achieved
by the concatenated coding system over the baseline system.



4.5.2 Concatenated Coding for Only the Reference Symbols

When a small fraction of the data has more severe
reliability requirements than the other data, the overall
system performance can sometimes be improved by concatenating
an outer code to the channel coding system for only the
higher reliability data. The advantages of using concatenated
coding on only a small fraction of the data rather
than on all the data are that the channel symbol rate
increase is much smaller and the power of the outer code can
be used where it is most needed.

In the uplink video communications system described 
here an error in a reference symbol is more serious than 
an error in an updating symbol. So as an example of this 
type of concatenated coding, let the concatenated coding 
be used on only the reference symbols. For the uplink sys- 
tem these reference symbols represent approximately 15% of 
the data presented to the channel coding system. 

When the outer coding is an LF89 feedback-decoded
convolutional coding system the expected number of picture
element errors per frame can be related to the bit error
rate of the baseline channel coding system by replacing
the updating symbol error probability approximation of
Table 4.1 by

    $p = 5\,(2000\, \ldots)$    [remainder of the expression illegible]

The 5% average symbol rate increase required by this
system can be accommodated by the baseline system without
any changes in the video compression ratio. Also,
this small symbol rate increase only reduces the inner
code $E_b/N_0$ by 0.2 dB.

The expected number of picture element errors
per frame of this concatenated system is about 0.1 of that
obtained with the baseline uplink system (see Figure 4.8).
This performance improvement corresponds to an $E_{pel}/N_0$
gain of about 0.5 dB. Subtracting the rate reduction loss
produces a net gain of only 0.3 dB. So this concatenated
system requires a larger $E_{pel}/N_0$ than the concatenated
system of the previous section. Of course, this concatenated
system does not require a larger source compression ratio.
Thus this system may be preferable to the previous system
when picture quality is considered.
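
The rate and decibel figures quoted in Sections 4.5.1 and 4.5.2 follow from simple ratios, as the sketch below illustrates. The helper names are hypothetical; the inputs (a rate 3/4 outer code, 15% of the data protected, a source compression factor of .75) are taken from the text.

```python
from math import log10


def db(ratio):
    """Express a power or energy ratio in decibels."""
    return 10.0 * log10(ratio)


def symbol_rate_increase(fraction_protected, outer_rate):
    """Fractional symbol rate increase when only part of the data is coded."""
    return fraction_protected * (1.0 / outer_rate - 1.0)


# Outer code on the reference symbols only: 15% of the data at rate 3/4.
increase = symbol_rate_increase(0.15, 0.75)      # about 5%
inner_code_loss_db = db(1.0 + increase)          # about 0.2 dB

# Outer code on all data: source bits reduced by the factor .75.
compression_gain_db = db(1.0 / 0.75)             # about 1.25 dB

# Net gain for the reference-only scheme: about 0.5 dB less the rate loss.
net_gain_db = 0.5 - inner_code_loss_db           # about 0.3 dB
```

These values match the 5%, 0.2 dB, 1.25 dB, and 0.3 dB figures in the text to the precision quoted there.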


5 . 0 Computer Simulation 

A Fortran program is written to simulate the 
motion detection technique. The simulation was perform- 
ed on the LIM Video Controller, whose functional block 
diagram is illustrated in Figure 5,1, To process a se- 
quence of pictures, the signal is first recorded on the 
video disc recorder, Ampex model DR-10, (up to 300 frames 
of video signal, or equivalent to 10 seconds of realtime 
video signal, can be processed) , then the LIM Video Con- 
troller processes the signal by extracting 4 consecutive 
horizontal lines at a time from the video disc recorder. 
It A-to-D converts the analog signal, and transports the 
digitized data to the Meta IV CPU for data manipulation. 
The processed data are then returned to the LIM Video 
Controller, D-to-A converted back into video signal, 
and recorded on the video disc recorder. Due to the 
inherently slow mechanical movements of the record/ 
playback heads, the reference signal and the processed 
signal are recorded at alternate tracks and by alternate 
record/playback heads. For example, the odd mambered 
reference frames and the even numbered processed frames 
are recorded by record/playback head A, (which controls 
the top side of the disc) , while the even numbered refe- 
rence frames and the odd numbered processed frames are 
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10 2 






















recorded by record/playback head B (which controls the
bottom side of the disc); this is illustrated in Figure
5.2. Because the record/playback heads are not perfectly
matched, signal variations result, in the form of
different signal gains and nonuniform intensity
changes between heads. The motion detection algorithm
uses a threshold comparison method, which is sensitive
to such stability drifts. These variations, together
with other signal variations resulting from the long term
drifts of the A-to-D and D-to-A converters, video amplifiers,
and timing generator, should therefore be taken into account
for a more accurate interpretation of the effectiveness of
the algorithm. (These intrinsic hardware drifts are very
small, yet because of the long processing time between
frames their effects are not negligible in simulating motion
detection.) The above signal variations have resulted in an
excessive number of detected motion regions and some
unwanted noise, especially when
the threshold is set to a high level. To obtain a closer
estimate of the number of detected motion regions,
another Fortran program was written in which the threshold
comparisons are made between corresponding motion regions
of the adjacent fields. By doing so, the field 2 motion
detection is obtained by comparisons of video signals

[Figure 5.2: Typical Geometrical Orientation of Reference and Processed Frames. Labeled parts: spindle, record/playback head B. 1, 2, 3, ... are input reference frame numbers; 1', 2', 3', ... are processed frame numbers.]

of two fields that are obtained by the same record/playback
head, while that of field 1 is obtained by
different record/playback heads. Our experiments have
shown that the number of motion regions detected in field
1 is, in general, substantially higher than that in
field 2. This substantiates our claim about the effect
of signal variations due to different record/playback
heads. However, it should be emphasized that the field-to-field
motion detection algorithm mentioned above is
not desirable from a quality point of view, because
this method induces extra detected motion regions
due to the location errors of motion regions between adjacent
fields. In particular, if the scene contains horizontal
objects, the algorithm causes certain oscillations
about the horizontal edges, because here the algorithm attempts
to correct the errors due to unmatched locations.
The sole purpose of the field-to-field experiment is to
obtain a closer estimate of the number of motion regions
detected.
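
The sensitivity of the threshold comparison to head mismatch can be illustrated with a minimal Python sketch. It is not the report's Fortran program. The region layout (one representative sample value per region), the function names, the threshold of 3, the "greater than or equal" comparison, and the 5% gain mismatch are all assumptions chosen only for illustration.

```python
def detect_motion_regions(reference, current, threshold):
    """Return indices of regions whose absolute difference reaches the threshold.

    Assumption: each region is summarized by a single integer sample value.
    """
    return [
        index
        for index, (ref, cur) in enumerate(zip(reference, current))
        if abs(ref - cur) >= threshold
    ]


def apply_gain(samples, gain_percent):
    """Model a playback-head gain mismatch as a uniform scale factor.

    Integer arithmetic with rounding to nearest; the uniform-gain model
    itself is an assumption, not something stated in the report.
    """
    return [(sample * gain_percent + 50) // 100 for sample in samples]


# Hypothetical 8-region scene; only region 5 really changes between frames.
previous = [40, 42, 45, 50, 60, 80, 90, 100]
current = [40, 42, 45, 50, 60, 95, 90, 100]

# Both frames read by the same head: only the true change is detected.
same_head = detect_motion_regions(previous, current, threshold=3)

# Current frame read by a head with 5% more gain: bright static regions
# now exceed the threshold as well.
other_head = detect_motion_regions(previous, apply_gain(current, 105), threshold=3)

print(same_head)
print(other_head)
```

In this toy case a small gain difference is enough to flag static regions, and the brighter regions are affected first because a gain error scales with signal level.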

The Fortran program for the frame-to-frame motion
detection is given in the following pages. Most of the
Fortran subroutines are described in the LIM Video Controller
Operator's Manual [6] or in the "Study of Efficient
Video Compression Algorithm for Space Shuttle Application"

[Fortran program listing for the frame-to-frame motion detection, printout pages 1 to 6. The listing is not legible in this reproduction.]

Final Report [1]; otherwise, they are described as follows:

1) XPOSE (M1, M2, M3, M4):

exchange M1 and M2, and
exchange M3 and M4

Let A, B, Q, IQ and M be arrays, and let N1,
N2, N3, N4, N5, N6 and M(N4) be positive integers.

2) IDIFF, IRENEW (A(N1), B(N2), M(N4)):

These operations replace the set of numbers:
A(N1), A(N1+M(j))

by the differences (for IDIFF), or the sums
(for IRENEW):

A(N1) ± B(N2), A(N1+M(j)) ± B(N2+M(j)),

for j=N4+1, N4+2, ..., N4+M(N4).

3) IBNQT, or ILNQT (Q(N1), N2, IQ(N3), M(N4), A(N5))

These operations replace the set of numbers:
A(N5), A(N5+M(j)) for j=N4+1, ...,
N4+M(N4)

by their inverse quantized values. The quantization
cutpoint table is stored in the array:

Q(N1), Q(N1+1), ..., Q(N1+N2-2),

and the inverse look up table is stored in the
array:

IQ(N3), IQ(N3+1), ..., IQ(N3+N2-1).

The number of quantization levels is N2. IBNQT
uses a binary search method, while ILNQT uses
a linear search method.

4) MBNQT, or MLNQT (Q(N1), N2, IQ(N3), M(N4),
A(N5), B(N6))

These operations are almost identical to
(3), except that the inverse quantized values
are put in the array:

B(N6), B(N6+M(j)), j=N4+1, ...,
N4+M(N4),

(i.e., the old data are not destroyed).

5) ISS (M1, M2) = |M1 - M2|,
for any integers M1 and M2.
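
The subroutine descriptions above can be sketched in Python as follows. This is an illustrative reading of the descriptions, not the original Fortran. Several points are assumptions: arrays are 0-based Python lists (the Fortran arrays are 1-based), M(j) is taken literally as an offset from the base element, a value equal to a cutpoint is assigned to the upper level, and the helper names `region_offsets`, `inverse_quantize` and `inverse_quantize_linear` are hypothetical.

```python
from bisect import bisect_right


def iss(m1, m2):
    """ISS: absolute difference of two integers."""
    return abs(m1 - m2)


def region_offsets(m, n4):
    """Offsets addressed through the pointer table M.

    M(N4) holds the entry count; M(N4+1) .. M(N4+M(N4)) hold the offsets.
    Offset 0 (the base element itself) is always included.
    """
    count = m[n4]
    return [0] + [m[j] for j in range(n4 + 1, n4 + count + 1)]


def idiff(a, n1, b, n2, m, n4):
    """IDIFF: replace the selected elements of A by A - B, in place."""
    for offset in region_offsets(m, n4):
        a[n1 + offset] -= b[n2 + offset]


def irenew(a, n1, b, n2, m, n4):
    """IRENEW: replace the selected elements of A by A + B, in place."""
    for offset in region_offsets(m, n4):
        a[n1 + offset] += b[n2 + offset]


def inverse_quantize(value, cutpoints, levels):
    """Binary search over N2-1 sorted cutpoints, returning one of N2 levels."""
    return levels[bisect_right(cutpoints, value)]


def inverse_quantize_linear(value, cutpoints, levels):
    """Linear search giving the same result as the binary search version."""
    index = 0
    while index < len(cutpoints) and value >= cutpoints[index]:
        index += 1
    return levels[index]


def ibnqt(cutpoints, levels, m, n4, a, n5):
    """IBNQT: overwrite the selected elements of A by their quantized values."""
    for offset in region_offsets(m, n4):
        a[n5 + offset] = inverse_quantize(a[n5 + offset], cutpoints, levels)


def mbnqt(cutpoints, levels, m, n4, a, n5, b, n6):
    """MBNQT: like IBNQT, but write the results to B and leave A unchanged."""
    for offset in region_offsets(m, n4):
        b[n6 + offset] = inverse_quantize(a[n5 + offset], cutpoints, levels)


# Hypothetical data: pointer table with 2 entries, selecting offsets 2 and 4.
pointers = [2, 2, 4]
frame = [10, 20, 30, 40, 50, 60]
prediction = [1, 2, 3, 4, 5, 6]

idiff(frame, 0, prediction, 0, pointers, 0)   # differences at offsets 0, 2, 4
print(frame)
irenew(frame, 0, prediction, 0, pointers, 0)  # sums restore the original values
print(frame)
```

Because IDIFF and IRENEW touch the same elements, applying one after the other with the same pointer table restores the original data, which is what the sketch prints in its second line.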


5.1 Video Tape Presentation

The results of video simulations of the motion detection
algorithm are recorded on a video tape that accompanies
this report. Various input video sequences with
different types of motion were tested. Each sequence
is identified by a sample reference number, which is
indicated at the beginning of each sequence and at the upper
left-hand corner of each frame. (The number indicated at
the upper right-hand corner of each frame is the relative
frame number within each sequence.) The percentages of motion
regions detected and the corresponding source rates
achievable without abridging are shown in Table 5.1.

The video tape was recorded by a Sony video tape
recorder, model AV3560, which, unfortunately, contributes
a significant amount of recording noise and synchronization
instability. The synchronization instability is particularly
noticeable and objectionable at the beginning of each
recorded sequence. To make detail easier to observe, each
sequence is replayed in slow motion at a rate of 1:5.

It is worth emphasizing that a substantial portion
of the motion regions detected is caused by simulation
noise, such as the signal variations between heads and long
term hardware stability drift, which are discussed in
the previous section. In a well implemented realtime system,
their effects are negligible; hence, a much improved
video quality and a more efficient source data rate can be expected.

Sample      Percentage of Motion Regions Detected   Achievable source rate
Reference   Min.        Max.        Average         without sequence abridging *, **

I           11%         28%         15%             ≈ .62 bit/pel
II           8%         20%         12%               .46 bit/pel
III         10%         25%         17%             ≈ .56 bit/pel

*  assuming 1/16 bit/pel is used to implement the synchronization codewords.
** a much lower source rate can be achieved with some sequence abridging.

Table 5.1 Percentage of Motion Regions Detected